AD-A145 789    UNCLASSIFIED
ESTIMATION IN NONLINEAR TIME SERIES MODELS II: SOME NONSTATIONARY SERIES (U)
NORTH CAROLINA UNIV AT CHAPEL HILL, DEPT OF STATISTICS
D TJØSTHEIM    JUL 84    TR-71
AFOSR-TR-84-8827    F49628-82-C-8889    F/G 12/1




CENTER  FOR  STOCHASTIC  PROCESSES 


Department  of  Statistics 
University  of  North  Carolina 
Chapel  Hill,  North  Carolina 



ESTIMATION  IN  NONLINEAR  TIME  SERIES  MODELS  II: 


SOME  NONSTATIONARY  SERIES 


by 


Dag  Tjøstheim


TECHNICAL  REPORT  #71 


Department  of  Mathematics 
University  of  Bergen 
5000  Bergen,  Norway 

and 

Department  of  Statistics 
University  of  North  Carolina 
Chapel  Hill,  North  Carolina  27514 


Abstract 

In  an  earlier  paper  (Tjøstheim  1984a)  a  general  framework  was
introduced  for  analyzing  estimates  in  stationary  nonlinear  time 
series  models.  In  the  present  paper  the  framework  is  enlarged  to 
include  certain  nonstationary  and  nonlinear  series.  General  conditions 
for  strong  consistency  and  asymptotic  normality  are  derived  both 
for  conditional  least  squares  and  maximum  likelihood  type  estimates. 
Examples  are  taken  from  threshold  autoregressive,  random  coefficient 
autoregressive  and  doubly  stochastic  (dynamic  state  space)  models.  The 
emphasis  in  the  examples  is  on  conditional  least  squares  estimates. 

1.  Introduction


The assumption of stationarity imposed in Tjøstheim (1984a) (hereafter referred to as T1) is sometimes too strict. In this paper we will try to extend the general framework established in T1 to some classes of nonstationary models. We will only treat certain types of nonstationarity, such as that arising from the absence of a stationary initial distribution, or the nonstationarity arising from a nonhomogeneous generating white noise process. We will also briefly look at autoregressive (AR) models where the AR coefficients are deterministic functions of time.

The approach to proving consistency and asymptotic normality is similar to the one used in T1. We rely on Theorem 2.1, but need a scaled version of Theorem 2.2, of that paper. The ergodic theorem and the Billingsley (1961) central limit theorem for an ergodic strictly stationary martingale difference sequence are no longer available, and, as a consequence, a heavier use of pure martingale arguments (mainly martingale type almost sure convergence and central limit theorems) is necessary to obtain our results.

The results are not complete, and our examples are not as general as in T1, but we believe that they are representative of at least some of the difficulties arising.

2.  Conditional least squares.

Throughout the paper we will use the same notation as in T1. Thus we let $\{X_t,\ t \in I\}$ be a d-dimensional discrete time stochastic process taking values in $R^d$ and defined on a probability space $(\Omega, \mathcal{F}, P)$. The second moments of $\{X_t\}$ will be assumed to exist. The index set $I$ is either the set of all integers or the set of all positive integers. We denote by $\mathcal{F}_t^X$ the σ-field generated by $\{X_s,\ s \le t\}$ and by $\tilde X_{t|t-1}(\beta)$ the conditional expectation $E_\beta(X_t \mid \mathcal{F}_{t-1}^X)$, which depends on an r-dimensional parameter vector $\beta$. As in T1, we will often suppress $\beta$ in our notations. Moreover, we denote by $f_{t|t-1}(\beta)$ the $d \times d$ conditional prediction error covariance matrix defined by

$$f_{t|t-1} = E\{(X_t - \tilde X_{t|t-1})(X_t - \tilde X_{t|t-1})^T \mid \mathcal{F}_{t-1}^X\}. \tag{2.1}$$


We assume that observations $(X_1,\ldots,X_n)$ are given, and we intend to estimate $\beta$ by minimization of the conditional least squares penalty function given by

$$Q_n(\beta) = \sum_{t=m+1}^{n} |X_t - \tilde X_{t|t-1}(\beta)|^2 \tag{2.2}$$

where $|\cdot|$ is used to denote Euclidean norm, and where in practice the lower summation limit has to be chosen so that $\tilde X_{m+1|m}$ is well-defined in terms of the observations $(X_1,\ldots,X_n)$. Unlike the case of consistency for stationary series in T1, it will not be possible to condition on $\mathcal{F}_{t-1}^X(m)$, which is the σ-field generated by $\{X_s,\ t-m \le s \le t-1\}$. This is because we will rely more on pure martingale arguments, and then we need an increasing sequence of σ-fields. Hence, in this paper we will always condition with respect to $\mathcal{F}_{t-1}^X$. For autoregressive type processes of order p it will then be possible to express $\tilde X_{t|t-1}$ in terms of $(X_1,\ldots,X_n)$ if $\min(n,m) \ge p$.
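As a rough illustration of the penalty function, the following Python sketch evaluates the sum of squared one-step prediction errors for a scalar series. The names `cls_penalty`, `predictor` and `ar1_predictor` are hypothetical and chosen here only for illustration, and the AR(1) predictor is an assumed example of a conditional expectation, not a model prescribed at this point of the paper.

```python
def cls_penalty(x, predictor, beta, m=1):
    """Sum over t = m+1..n of (X_t - prediction of X_t from X_1..X_{t-1})^2.

    x         : list of scalar observations, x[0] playing the role of X_1
    predictor : hypothetical callable (past, beta) -> one-step prediction,
                where `past` is the list of all earlier observations
    beta      : parameter value at which the penalty is evaluated
    m         : number of initial observations that are only conditioned on
    """
    total = 0.0
    for t in range(m, len(x)):  # 0-based index t corresponds to X_{t+1}
        residual = x[t] - predictor(x[:t], beta)
        total += residual * residual
    return total


def ar1_predictor(past, a):
    """Assumed example: conditional mean a * X_{t-1} of a first order AR model."""
    return a * past[-1]


x = [1.0, 0.5, 0.25, 0.125]
print(cls_penalty(x, ar1_predictor, 0.5))  # 0.0, every residual vanishes
print(cls_penalty(x, ar1_predictor, 0.0))  # 0.328125
```

Minimizing such a function over `beta`, analytically or numerically, gives the conditional least squares estimate; the sketch only evaluates the criterion.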

The following two theorems correspond to Theorems 3.1 and 3.2 of T1. We denote by $\beta^0$ the true value of $\beta$.


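Theorem 2.1: Assume that $\{X_t\}$ is a d-dimensional stochastic process with $E(|X_t|^2) < \infty$ and such that $\tilde X_{t|t-1}(\beta) = E_\beta(X_t \mid \mathcal{F}_{t-1}^X)$ is almost surely twice continuously differentiable in an open set $B$ containing $\beta^0$. Moreover, assume that there are two positive constants $M_1$ and $M_2$ such that for $t \ge m+1$

CN1: $$E\Big\{\frac{\partial \tilde X_{t|t-1}^T}{\partial\beta_i}(\beta^0)\, f_{t|t-1}(\beta^0)\, \frac{\partial \tilde X_{t|t-1}}{\partial\beta_i}(\beta^0)\Big\} \le M_1$$

CN2: $$E\Big\{\frac{\partial^2 \tilde X_{t|t-1}^T}{\partial\beta_i\partial\beta_j}(\beta^0)\, f_{t|t-1}(\beta^0)\, \frac{\partial^2 \tilde X_{t|t-1}}{\partial\beta_i\partial\beta_j}(\beta^0)\Big\} \le M_2$$

for $i,j = 1,\ldots,r$.

CN3: $$\liminf_{n\to\infty} \lambda_{\min}^n(\beta^0) > 0 \quad \text{a.s.}$$

where $\lambda_{\min}^n(\beta^0)$ is the smallest eigenvalue of the symmetric non-negative definite matrix $A^n(\beta^0)$ with matrix elements given by

$$A_{ij}^n(\beta^0) = \frac{1}{n}\sum_{t=m+1}^{n} \frac{\partial \tilde X_{t|t-1}^T}{\partial\beta_i}(\beta^0)\, \frac{\partial \tilde X_{t|t-1}}{\partial\beta_j}(\beta^0). \tag{2.3}$$

CN4: Let $N_\delta = \{\beta : |\beta - \beta^0| < \delta\}$ be contained in $B$. Then

$$\limsup_{\substack{n\to\infty \\ \delta\downarrow 0}} \delta^{-1}\Big| A_{ij}^n(\beta) - A_{ij}^n(\beta^0) - \frac{1}{n}\sum_{t=m+1}^{n}\Big[\{X_t - \tilde X_{t|t-1}(\beta)\}^T \frac{\partial^2 \tilde X_{t|t-1}}{\partial\beta_i\partial\beta_j}(\beta) - \{X_t - \tilde X_{t|t-1}(\beta^0)\}^T \frac{\partial^2 \tilde X_{t|t-1}}{\partial\beta_i\partial\beta_j}(\beta^0)\Big]\Big| < \infty \quad \text{a.s.}$$

for $i,j = 1,\ldots,r$.

Then there exists a sequence of estimators $\hat\beta_n = [\hat\beta_{n1},\ldots,\hat\beta_{nr}]^T$ such that $\hat\beta_n \to \beta^0$ a.s., and such that for $\varepsilon > 0$, there is an event $E$ in $(\Omega, \mathcal{F}, P)$ with $P(E) > 1-\varepsilon$ and an $n_0$ such that on $E$, for $n > n_0$, $\partial Q_n(\hat\beta_n)/\partial\beta_i = 0$, $i = 1,\ldots,r$, and $Q_n$ attains a relative minimum at $\hat\beta_n$.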

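Proof: From the definition of $Q_n(\beta)$ in (2.2) it is easily seen that $\{\partial Q_n(\beta^0)/\partial\beta_i,\ \mathcal{F}_n^X\}$ is a zero-mean martingale. The increments $u_n = \partial Q_n/\partial\beta_i - \partial Q_{n-1}/\partial\beta_i$ are such that (using CN1)

$$E\{u_t^2(\beta^0)\} = 4E\Big\{\frac{\partial \tilde X_{t|t-1}^T}{\partial\beta_i}(\beta^0)\, f_{t|t-1}(\beta^0)\, \frac{\partial \tilde X_{t|t-1}}{\partial\beta_i}(\beta^0)\Big\} \le 4M_1 \tag{2.4}$$

and it follows from a martingale strong law of large numbers (cf. Stout 1974, Th. 3.3.8) that $n^{-1}\partial Q_n(\beta^0)/\partial\beta_i \to 0$ a.s. as $n \to \infty$, and A1 of Theorem 2.1 of T1 is fulfilled. Computing second order derivatives we have

$$\frac{\partial^2 Q_n}{\partial\beta_i\partial\beta_j} = 2\sum_{t=m+1}^{n} \frac{\partial \tilde X_{t|t-1}^T}{\partial\beta_i}\frac{\partial \tilde X_{t|t-1}}{\partial\beta_j} - 2\sum_{t=m+1}^{n} \frac{\partial^2 \tilde X_{t|t-1}^T}{\partial\beta_i\partial\beta_j}\{X_t - \tilde X_{t|t-1}\}. \tag{2.5}$$

Here $\{\partial^2 \tilde X_{t|t-1}^T(\beta^0)/\partial\beta_i\partial\beta_j\,[X_t - \tilde X_{t|t-1}(\beta^0)]\}$ defines a martingale difference sequence with respect to $\{\mathcal{F}_t^X\}$ and using CN2 while reasoning as above we have

$$\frac{1}{2n}\frac{\partial^2 Q_n}{\partial\beta_i\partial\beta_j}(\beta^0) - \frac{1}{n}\sum_{t=m+1}^{n} \frac{\partial \tilde X_{t|t-1}^T}{\partial\beta_i}(\beta^0)\frac{\partial \tilde X_{t|t-1}}{\partial\beta_j}(\beta^0) \to 0 \quad \text{a.s.} \tag{2.6}$$

as $n \to \infty$, and hence CN3 implies A2 of Theorem 2.1 of T1. Using (2.5) it is seen that CN4 is identical to A3, and the conclusion now follows from Theorem 2.1 of T1. ||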

The conditions CN1 and CN2 may be weakened in two directions. According to Theorem 3.3.8 of Stout (1974), $M_i$, $i=1,2$, may be replaced with $M_i t^{\alpha_i}$ with $0 < \alpha_i < 1$, $i=1,2$, allowing a moderate growth of moments with $t$. Using Corollary 3.3.5 of Stout (1974) it is seen that another possibility is to replace CN1 with

$$E\Big[\Big|\frac{\partial \tilde X_{t|t-1}^T}{\partial\beta_i}(\beta^0)\{X_t - \tilde X_{t|t-1}(\beta^0)\}\Big| \Big(\log^+\Big|\frac{\partial \tilde X_{t|t-1}^T}{\partial\beta_i}(\beta^0)\{X_t - \tilde X_{t|t-1}(\beta^0)\}\Big|\Big)^{1+\varepsilon}\Big] \le M_1 \tag{2.7}$$

for $i=1,\ldots,r$ for some $\varepsilon > 0$ and CN2 with the obvious analogue.

When we now turn to the asymptotic distribution of $\hat\beta_n$, we cannot rely on Billingsley's (1961) result for ergodic strictly stationary martingale difference sequences which was used in Theorem 3.2 of T1. However, there are more recent results from martingale central limit theory that can be applied. Typically these require a random scaling factor.

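Let $\partial \tilde X_{t|t-1}/\partial\beta$ be the $d \times r$ matrix having $\partial \tilde X_{t|t-1}/\partial\beta_i$, $i=1,\ldots,r$, as its column vectors and let $R_n$ be the $r \times r$ symmetric non-negative definite matrix given by

$$R_n = \sum_{t=m+1}^{n} E\Big\{\frac{\partial \tilde X_{t|t-1}^T}{\partial\beta}\, f_{t|t-1}\, \frac{\partial \tilde X_{t|t-1}}{\partial\beta}\Big\}. \tag{2.8}$$

Moreover, denote by $T_n$ the stochastic $r \times r$ symmetric non-negative definite matrix defined by

$$T_n = \sum_{t=m+1}^{n} \frac{\partial \tilde X_{t|t-1}^T}{\partial\beta}\frac{\partial \tilde X_{t|t-1}}{\partial\beta}. \tag{2.9}$$

We will denote by $A^-$ the Moore-Penrose inverse of a matrix $A$ and by $\det(A)$ the determinant of $A$. Then we have

Theorem 2.2: Assume that the conditions of Theorem 2.1 are fulfilled and assume in addition that

DN1: $$\liminf_{n\to\infty} n^{-r}\det\{R_n(\beta^0)\} > 0$$

DN2: $$R_n^{-1/2}(\beta^0)\sum_{t=m+1}^{n} \frac{\partial \tilde X_{t|t-1}^T}{\partial\beta}(\beta^0)\{X_t - \tilde X_{t|t-1}(\beta^0)\}\{X_t - \tilde X_{t|t-1}(\beta^0)\}^T \frac{\partial \tilde X_{t|t-1}}{\partial\beta}(\beta^0)\, R_n^{-1/2}(\beta^0) \xrightarrow{\text{a.s.}} I_r$$

where $I_r$ is the identity matrix of dimension r.

Let $\{\hat\beta_n\}$ be the estimators obtained in Theorem 2.1. Then

$$R_n^{-1/2}(\beta^0)\, T_n(\beta^0)(\hat\beta_n - \beta^0) \xrightarrow{d} N(0, I_r). \tag{2.10}$$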


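Proof: Note first of all that $R_n(\beta^0)$ is finite from CN1. Let

$$S_n = -\frac{1}{2}\frac{\partial Q_n}{\partial\beta} = \sum_{t=m+1}^{n} \frac{\partial \tilde X_{t|t-1}^T}{\partial\beta}\{X_t - \tilde X_{t|t-1}\} \triangleq \sum_{t=m+1}^{n} \zeta_t. \tag{2.11}$$

Since we are dealing with an asymptotic result, as in the proof of Theorem 2.2 of Klimko and Nelson (1978), we may assume that $S_n(\hat\beta_n) = 0$. Taylor expanding $S_n$ about $\beta^0$ and subsequently normalising with $R_n^{-1/2}(\beta^0)$ we have

$$0 = R_n^{-1/2}(\beta^0)S_n(\beta^0) + R_n^{-1/2}(\beta^0)\frac{\partial S_n}{\partial\beta}(\beta_n^*)(\hat\beta_n - \beta^0) \tag{2.12}$$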


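where $\beta_n^*$ is an intermediate point between $\hat\beta_n$ and $\beta^0$. Again, reasoning as in the proof of Theorem 2.2 of Klimko and Nelson (1978), in the limit as $n \to \infty$ we may replace $\beta_n^*$ by $\beta^0$. Moreover, using DN1, the boundedness condition CN2 and the orthogonal increment property of a martingale difference sequence, it follows from Chebyshev's inequality that there exists an $n_0$ such that

$$F_n(\beta^0) \triangleq R_n^{-1/2}(\beta^0)\sum_{t=m+1}^{n}\Big[\frac{\partial\zeta_t}{\partial\beta}(\beta^0) - E\Big\{\frac{\partial\zeta_t}{\partial\beta}(\beta^0) \,\Big|\, \mathcal{F}_{t-1}^X\Big\}\Big] \tag{2.13}$$

is bounded in probability for $n > n_0$. Since, from Theorem 2.1, $\hat\beta_n \to \beta^0$ a.s., it follows that $F_n(\beta^0)(\hat\beta_n - \beta^0) \to 0$ in probability, and therefore, when taking distributional limits in (2.12), $R_n^{-1/2}(\beta^0)\,\partial S_n(\beta_n^*)/\partial\beta$ may be replaced by

$$R_n^{-1/2}(\beta^0)\sum_{t=m+1}^{n} E\Big\{\frac{\partial\zeta_t}{\partial\beta}(\beta^0) \,\Big|\, \mathcal{F}_{t-1}^X\Big\} = -R_n^{-1/2}(\beta^0)\,T_n(\beta^0) \tag{2.14}$$

and hence from (2.10) and (2.12), the theorem will be proved if we can prove that $R_n^{-1/2}(\beta^0)S_n(\beta^0) \xrightarrow{d} N(0, I_r)$.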

We use a Cramér-Wold argument. For an r-dimensional vector $\lambda$ of real numbers it is sufficient to prove that

$$\lambda^T R_n^{-1/2}(\beta^0)S_n(\beta^0) \xrightarrow{d} N(0, \lambda^T\lambda). \tag{2.15}$$

For this purpose we introduce

$$\xi_{tn} = \lambda^T R_n^{-1/2}\zeta_t. \tag{2.16}$$

Then $\lambda^T R_n^{-1/2}S_n = \sum_{t=m+1}^{n}\xi_{tn}$, and for $\beta = \beta^0$ we have that $\xi_{tn}$, $m+1 \le t \le n$, are martingale increments for a zero-mean square integrable martingale array $\{S_{in} = \sum_{t=m+1}^{i}\xi_{tn},\ m+1 \le i \le n\}$. It is then sufficient to verify the following conditions (cf. Hall and Heyde 1980, Th. 3.2, where the nesting and integrability conditions of that theorem are trivially fulfilled) for $\beta = \beta^0$:

(i) $\max_{m+1 \le t \le n} |\xi_{tn}| \xrightarrow{P} 0$

(ii) $\sum_{t=m+1}^{n} \xi_{tn}^2 \xrightarrow{P} \lambda^T\lambda$

(iii) $E(\max_{m+1 \le t \le n} \xi_{tn}^2)$ is bounded in n.

The condition (ii) follows trivially from the definition of $\xi_{tn}$ and the assumption DN2. Moreover,

$$\max_{m+1 \le t \le n} \xi_{tn}^2 \le \sum_{t=m+1}^{n}\xi_{tn}^2 = \lambda^T R_n^{-1/2}\sum_{t=m+1}^{n}\zeta_t\zeta_t^T\, R_n^{-1/2}\lambda \tag{2.17}$$

and using the definition of $R_n$ in (2.8) we have that the expectation of the extreme right hand side of (2.17) is $\lambda^T\lambda$, and (iii) follows from this.

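Also, using the technique described in Hall and Heyde (1980, p. 53), for a given $\varepsilon > 0$

$$P\Big(\max_{m+1 \le t \le n}|\xi_{tn}| > \varepsilon\Big) = P\Big(\sum_{t=m+1}^{n}\xi_{tn}^2\, 1(|\xi_{tn}| > \varepsilon) > \varepsilon^2\Big) \tag{2.18}$$

where $1(\cdot)$ is the indicator function. But

$$\sum_{t=m+1}^{n} E\{\xi_{tn}^2\, 1(|\xi_{tn}| > \varepsilon)\} = \sum_{t=m+1}^{n} \lambda^T R_n^{-1/2} E\{\zeta_t\zeta_t^T\, 1(\lambda^T R_n^{-1/2}\zeta_t\zeta_t^T R_n^{-1/2}\lambda > \varepsilon^2)\} R_n^{-1/2}\lambda \tag{2.19}$$

and using the definition of $\zeta_t$ and the conditions CN1 and DN1 we have that for a given $\delta > 0$, there is an $n_0$ such that for $n > n_0$ and all $t$, $m+1 \le t \le n$,

$$\Big|E\{(\zeta_t\zeta_t^T)_{ij}\, 1(\lambda^T R_n^{-1/2}\zeta_t\zeta_t^T R_n^{-1/2}\lambda > \varepsilon^2)\}\Big| < \delta \tag{2.20}$$

for $\beta = \beta^0$. Again using CN1 and DN1 there exists an $n_1$ such that $|\{R_n^{-1}(\beta^0)\}_{ij}| \le kn^{-1}$ for $n \ge n_1$; $i,j = 1,\ldots,r$, and for some constant k. Let $n' = \max(n_0, n_1)$. Then from (2.19) and (2.20) we have for $\beta = \beta^0$ and for $n > n'$

$$\sum_{t=n'}^{n} E\{\xi_{tn}^2\, 1(|\xi_{tn}| > \varepsilon)\} \le K(\lambda, k)\,\delta, \tag{2.21}$$

where $K(\lambda, k)$ is a constant depending on $\lambda$ and k but independent of n. On the other hand, using CN1, DN1 and (2.19) it follows at once that for $\beta = \beta^0$

$$\sum_{t=m+1}^{n'} E\{\xi_{tn}^2\, 1(|\xi_{tn}| > \varepsilon)\} \to 0 \tag{2.22}$$

as $n \to \infty$. Using Chebyshev's inequality, (2.21) and (2.22) now imply (i), and the proof is completed. ||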

The matrix $R_n$ corresponds to the number of observations n in the statement of Theorem 3.2 of T1. In the stationary ergodic case $n^{-1}R_n \to R$ and $n^{-1}T_n \to U$ a.s. as $n \to \infty$, where U and R are given by (3.6) and condition D1 of T1, and it is seen that (2.10) then reduces to (3.18) of T1. However, in the nonstationary case we do not require the convergence of $n^{-1}R_n$ and $n^{-1}T_n$, and in fact for the examples to be treated in the next section these quantities do not always converge.

If $T_n - U_n \to 0$, where $U_n = E(T_n)$, then the asymptotic covariance matrix of $\hat\beta_n$ is given by $U_n^{-1}R_nU_n^{-1}$, which tends to zero by Theorem 2.1.
n  &  J  n  n  n 

3.  Examples

As in T1 we will illustrate our general results on a variety of nonlinear time series classes. The technical difficulties are larger than in the stationary ergodic case, and, partly to display the essential elements involved more clearly, we will confine ourselves to discussing scalar first order AR type models. Extensions to higher order and vector models will be relatively straightforward in some of the cases. As for the examples in T1 we will generally omit the superscript 0 for the true value of the parameters.

3.1  Threshold autoregressive processes.

These models were originally introduced by Tong (1977) in connection with the analysis of river flow data. The underlying idea is a piecewise linearization of the model by introduction of a local threshold dependence on the amplitude $X_t$. In the nomenclature of Tong and Lim (1980) a scalar SETAR(m,p,...,p) model is given by

$$X_t - \sum_{i=1}^{p} a_i^j X_{t-i} = e_t^j \tag{3.1}$$

for $[X_{t-1},\ldots,X_{t-p}]^T \in F_j$, $j=1,\ldots,m$, where $F_1,\ldots,F_m$ are disjoint regions of the p-dimensional Euclidean space $R^p$, such that $\bigcup_{j=1}^{m} F_j = R^p$. Moreover, $\{e_t^j\}$, $j=1,\ldots,m$, are independent white noise series consisting of independent identically distributed (iid) variables. Tong and Lim (1980) consider the numerical evaluation of maximum likelihood estimates of the parameters of the threshold model. In a reply to the discussion of their paper they also mention the possibility of applying the theory of Klimko and Nelson (1978) to study the properties of these estimates, but we are not aware of any actual work in this direction.

We will only treat the first order AR case (p=1 in (3.1)), and we will assume that there is only one residual process $\{e_t\}$ consisting of zero-mean iid random variables. We can then write (3.1) as

$$X_t - \sum_{j=1}^{m} a^j X_{t-1}H_j(X_{t-1}) = e_t \tag{3.2}$$

where this equation is supposed to hold for $t \ge 2$ with $X_1$ as an initial variable, and where $H_j(X_{t-1}) = 1(X_{t-1} \in F_j)$, $1(\cdot)$ being the indicator function. There is no explicit time dependence in (3.1) and (3.2). The reason that we did not treat such processes in connection with our study of stationary processes in T1 is that we have not been able to prove the existence of an invariant stationary distribution for the initial variables in the threshold case (cf. Sec. 4.1 of T1). For a general initial variable $X_1$ it is clear that the process generated by (3.2) will be nonstationary.

Theorem 3.1: Let $\{X_t\}$ be defined by (3.2). Assume that the threshold regions $F_j$ are such that there exist constants $\alpha_j > 0$ so that for all t, $E\{X_t^2 H_j(X_t)\} \ge \alpha_j$, $j=1,\ldots,m$. Moreover, assume that $|a^j| < 1$, $j=1,\ldots,m$, $E(X_1^4) < \infty$ and $E(e_t^4) < \infty$. Then there exists a strongly consistent sequence of estimators $\{\hat a_n\} = \{[\hat a_n^1,\ldots,\hat a_n^m]^T\}$ for $a = [a^1,\ldots,a^m]^T$. These estimates are obtained by minimizing the penalty function $Q_n$ of (2.2), and they are jointly asymptotically normal.
Proof: The system of equations $\partial Q_n/\partial a^j = 0$, $j=1,\ldots,m$, is linear in $a^1,\ldots,a^m$, and it is easily verified that $Q_n$ is minimized by taking

$$\hat a_n^j = \frac{\sum_{t=2}^{n} X_tX_{t-1}H_j(X_{t-1})}{\sum_{t=2}^{n} X_{t-1}^2 H_j(X_{t-1})} \tag{3.3}$$

where this exists with probability one since $E\{X_t^2 H_j(X_t)\} \ge \alpha_j$.
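The closed form (3.3) is easy to evaluate numerically. The Python sketch below simulates a two-regime first order threshold model and computes the regime-wise ratio estimates. The function names, the Gaussian noise, the split of the real line at zero and the coefficient values 0.2 and 0.7 are assumptions made purely for illustration; they are not taken from the paper.

```python
import random


def simulate_tar1(a_low, a_high, n, seed=0):
    """Simulate X_t = a^j X_{t-1} + e_t with regions (-inf, 0] and (0, inf).

    The initial variable X_1 and the noise e_t are drawn as standard normals
    (an illustrative assumption).
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0)]
    for _ in range(n - 1):
        prev = x[-1]
        coef = a_high if prev > 0 else a_low
        x.append(coef * prev + rng.gauss(0.0, 1.0))
    return x


def cls_threshold_ar1(x, regions):
    """Ratio estimate for each region:
    sum X_t X_{t-1} H_j(X_{t-1}) / sum X_{t-1}^2 H_j(X_{t-1}).

    regions : list of indicator functions H_j, one per threshold region
    """
    estimates = []
    for in_region in regions:
        num = sum(x[t] * x[t - 1] for t in range(1, len(x)) if in_region(x[t - 1]))
        den = sum(x[t - 1] ** 2 for t in range(1, len(x)) if in_region(x[t - 1]))
        # The ratio is undefined if no past observation fell in the region.
        estimates.append(num / den if den > 0 else float("nan"))
    return estimates


regions = [lambda v: v <= 0, lambda v: v > 0]
series = simulate_tar1(0.2, 0.7, 20000, seed=1)
print(cls_threshold_ar1(series, regions))  # values near 0.2 and 0.7 are expected
```

Because the normal equations decouple across regions, each coefficient is estimated only from the time points whose predecessor lies in the corresponding region.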


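Using (3.2) and the independence of the $e_t$'s we have

$$\tilde X_{t|t-1} = \sum_{j=1}^{m} a^j X_{t-1}H_j(X_{t-1}), \qquad \frac{\partial \tilde X_{t|t-1}}{\partial a^j} = X_{t-1}H_j(X_{t-1}), \tag{3.4}$$

while higher order derivatives are zero. Also, it is easily shown that $f_{t|t-1} = E\{(X_t - \tilde X_{t|t-1})^2 \mid \mathcal{F}_{t-1}^X\} = E(e_t^2) = \sigma^2$. Since $\partial \tilde X_{t|t-1}/\partial a^j$ does not depend on $a^k$, $k=1,\ldots,m$, it follows that CN2 and CN4 of Theorem 2.1 are trivially fulfilled. Moreover, using $|a^j| < 1$, $j=1,\ldots,m$, $E(X_1^4) < \infty$ and $E(e_t^4) < \infty$ it follows from (3.2) that $E(X_t^4) \le K$ for some constant K, and that CN1 of Theorem 2.1 holds.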


From the special structure of the derivatives given in (3.4) we have that the matrix $A^n$ in (2.3) in the present case is a diagonal matrix and is given by

$$A^n = \operatorname{diag}\Big[\frac{1}{n}\sum_{t=2}^{n} X_{t-1}^2 H_j(X_{t-1})\Big] \tag{3.5}$$

and using the assumption $E\{X_t^2 H_j(X_t)\} \ge \alpha_j$ we have that CN3 of Theorem 2.1 will be fulfilled if we can prove that

$$\frac{1}{n}\sum_{t=1}^{n} X_t^2 H_j(X_t) - \frac{1}{n}\sum_{t=1}^{n} E\{X_t^2 H_j(X_t)\} \to 0 \quad \text{a.s.} \tag{3.6}$$

for $j=1,\ldots,m$. This will also be the key relationship used in the proof of asymptotic normality.


From the strong law of large numbers we have

$$\frac{1}{n}\sum_{t=1}^{n} e_t^2 - \sigma^2 \to 0 \quad \text{a.s.} \tag{3.7}$$

as $n \to \infty$. Since $H_i(X_t)H_j(X_t) = \delta_{ij}H_i(X_t)$, it follows from (3.2) that

$$e_t^2 = X_t^2 - 2X_t\sum_{j=1}^{m} a^j X_{t-1}H_j(X_{t-1}) + \sum_{j=1}^{m} (a^j)^2 X_{t-1}^2 H_j(X_{t-1}). \tag{3.8}$$

Again inserting from (3.2) we have

$$X_t\sum_{j=1}^{m} a^j X_{t-1}H_j(X_{t-1}) = \sum_{j=1}^{m} (a^j)^2 X_{t-1}^2 H_j(X_{t-1}) + e_t\sum_{j=1}^{m} a^j X_{t-1}H_j(X_{t-1}) \triangleq \sum_{j=1}^{m} (a^j)^2 X_{t-1}^2 H_j(X_{t-1}) + U_t. \tag{3.9}$$

However, it is easily checked that $\{U_t\}$ is a martingale difference sequence with respect to $\{\mathcal{F}_t^X\}$, and since $E(U_t^2) = \sigma^2 E[\sum_{j=1}^{m} (a^j)^2 X_{t-1}^2 H_j(X_{t-1})] \le K_1$ for some constant $K_1$, it follows from the strong law for martingales (Stout, 1974, Th. 3.3.8) that $n^{-1}\sum_{t=2}^{n} U_t \to 0$ a.s. Inserted in (3.7) and (3.8) this yields

$$\frac{1}{n}\sum_{t=2}^{n} X_t^2 - \frac{1}{n}\sum_{t=2}^{n}\sum_{j=1}^{m} (a^j)^2 X_{t-1}^2 H_j(X_{t-1}) - \sigma^2 \to 0 \quad \text{a.s.} \tag{3.10}$$

Since $E(X_n^2) \le K$ for all n, we have $n^{-1}X_n^2 H_j(X_n) \to 0$ a.s. as $n \to \infty$ for $j=1,\ldots,m$. Furthermore, since $1 = \sum_{j=1}^{m} H_j(X_t)$, an alternative way of writing (3.10) is

$$\sum_{j=1}^{m} \{1 - (a^j)^2\}\frac{1}{n}\sum_{t=1}^{n} X_t^2 H_j(X_t) - \sigma^2 \to 0 \quad \text{a.s.} \tag{3.11}$$


On the other hand, since $E(U_t) = 0$ in (3.9), taking expectations in (3.8) and (3.9) and adjusting the summation index as above we have

$$\sum_{j=1}^{m} \{1 - (a^j)^2\}\frac{1}{n}\sum_{t=1}^{n} E\{X_t^2 H_j(X_t)\} - \sigma^2 \to 0 \tag{3.12}$$

as $n \to \infty$. Combining (3.11) and (3.12) it follows that

$$\sum_{j=1}^{m} \{1 - (a^j)^2\}\Big[\frac{1}{n}\sum_{t=1}^{n} X_t^2 H_j(X_t) - \frac{1}{n}\sum_{t=1}^{n} E\{X_t^2 H_j(X_t)\}\Big] \triangleq \sum_{j=1}^{m} \{1 - (a^j)^2\}Y_n^j \to 0 \quad \text{a.s.} \tag{3.13}$$

The zero-mean random variables $Y_n^j$, $j=1,\ldots,m$, are linearly independent for each fixed n. Since by assumption $|a^j| < 1$ for $j=1,\ldots,m$, the relationship (3.6) follows from (3.13), and this proves the consistency part of the theorem.

Turning now to the proof of asymptotic normality, it is not difficult to verify that the matrix $R_n$ defined in (2.8) in the present case is given by

$$R_n = \sigma^2 \operatorname{diag}\Big[\sum_{t=2}^{n} E\{X_{t-1}^2 H_j(X_{t-1})\}\Big], \tag{3.14}$$

and using the assumption $E\{X_{t-1}^2 H_j(X_{t-1})\} \ge \alpha_j$ for $j=1,\ldots,m$ it follows at once that DN1 of Theorem 2.2 is fulfilled. Moreover, the matrix in DN2 is seen to be given by

$$D_n = \frac{1}{\sigma^2}\operatorname{diag}\Big[\frac{\sum_{t=2}^{n} e_t^2 X_{t-1}^2 H_j(X_{t-1})}{\sum_{t=2}^{n} E\{X_{t-1}^2 H_j(X_{t-1})\}}\Big]. \tag{3.15}$$


Since $E(e_t^4) < \infty$ and $|a^j| < 1$, $j=1,\ldots,m$, there exists a $K_2$ such that $E\{X_{t-1}^4 H_j(X_{t-1})\} \le K_2$ for all j and t, and thus, using that $e_t$ is independent of $\mathcal{F}_{t-1}^X$, we have $E[\{e_t^2 X_{t-1}^2 H_j(X_{t-1})\}^2] \le K_2E(e_t^4)$. From the martingale strong law applied to the martingale difference sequence $\{e_t^2 X_{t-1}^2 H_j(X_{t-1}) - \sigma^2 X_{t-1}^2 H_j(X_{t-1})\}$ it follows that

$$\frac{1}{n}\sum_{t=2}^{n} e_t^2 X_{t-1}^2 H_j(X_{t-1}) - \frac{\sigma^2}{n}\sum_{t=2}^{n} X_{t-1}^2 H_j(X_{t-1}) \to 0 \quad \text{a.s.} \tag{3.16}$$

Using (3.6) and an addition-subtraction argument in (3.15) it follows that $D_n \to I_m$ a.s. as $n \to \infty$, and thus from Theorem 2.2

$$\frac{1}{\sigma}\operatorname{diag}\Big[\frac{\sum_{t=2}^{n} X_{t-1}^2 H_j(X_{t-1})}{[\sum_{t=2}^{n} E\{X_{t-1}^2 H_j(X_{t-1})\}]^{1/2}}\Big](\hat a_n - a) \xrightarrow{d} N(0, I_m). \quad || \tag{3.17}$$


It should be noted that we have asymptotic independence of the estimates $\hat a_n^j$, $j=1,\ldots,m$, in the sense that the asymptotic covariance matrix is diagonal. Moreover, taking (3.6) into consideration it is seen that (3.17) may be rewritten as

$$\frac{1}{\sigma}\operatorname{diag}\Big(\Big[\sum_{t=2}^{n} E\{X_{t-1}^2 H_j(X_{t-1})\}\Big]^{1/2}\Big)(\hat a_n - a) \xrightarrow{d} N(0, I_m) \tag{3.18}$$

which reduces to the familiar formula $\{nE(X_t^2)\}^{1/2}(\hat a_n - a)/\sigma \xrightarrow{d} N(0,1)$ in the ordinary (m=1) stationary AR(1) case.
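In practice the normalization in (3.18) suggests approximate standard errors for the regime coefficients. The Python sketch below is one possible way to form them and rests on two plug-in assumptions that are not part of the theorem: the expectations are replaced by the observed sums (motivated by (3.6)), and σ is replaced by the root mean squared residual. All function names, the two-region split at zero and the Gaussian noise are illustrative choices.

```python
import math
import random


def simulate_tar1(coefs, n, seed=0):
    """Two-regime first order threshold AR; regime 0 if X_{t-1} <= 0, else regime 1."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0)]  # assumed standard normal initial variable
    for _ in range(n - 1):
        prev = x[-1]
        x.append(coefs[1 if prev > 0 else 0] * prev + rng.gauss(0.0, 1.0))
    return x


def cls_with_standard_errors(x):
    """Return regime-wise CLS estimates and plug-in standard errors.

    se_j = sigma_hat / sqrt(sum X_{t-1}^2 H_j(X_{t-1})), a sample analogue
    of the scaling in (3.18) (assumption: sums replace expectations).
    """
    num, den = [0.0, 0.0], [0.0, 0.0]
    for t in range(1, len(x)):
        j = 1 if x[t - 1] > 0 else 0
        num[j] += x[t] * x[t - 1]
        den[j] += x[t - 1] ** 2
    a_hat = [num[j] / den[j] for j in range(2)]
    rss = sum(
        (x[t] - a_hat[1 if x[t - 1] > 0 else 0] * x[t - 1]) ** 2
        for t in range(1, len(x))
    )
    sigma_hat = math.sqrt(rss / (len(x) - 1))
    se = [sigma_hat / math.sqrt(den[j]) for j in range(2)]
    return a_hat, se


a_hat, se = cls_with_standard_errors(simulate_tar1([0.0, 0.5], 5000, seed=3))
for j in range(2):
    print(f"regime {j}: estimate {a_hat[j]:.3f}, approx. 95% half-width {1.96 * se[j]:.3f}")
```

The diagonal form of the limiting covariance is what allows each regime to be treated separately here.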

The conditions stated in the theorem can be relaxed. For example it is not necessary to require that the $e_t$'s are identically distributed. It is not difficult to check that the above proof applies to the case where the $e_t$'s are independent and zero-mean, and where $m \le E(e_t^2) \le M$ and $E(e_t^4) \le M'$ for some positive constants m, M and M'. It should also be noted that a similar nonstationary generalization can be made for the exponential autoregressive model treated in Section 4.1 of T1.

The condition $E\{X_t^2 H_j(X_t)\} \ge \alpha_j$ will be satisfied if the regions are chosen so that $P(X_t \in F_j) \ge \gamma_j > 0$ for some positive constants $\gamma_1,\ldots,\gamma_m$, and where $P(X_t = 0) \ne P(X_t \in F_{j_0})$ with $F_{j_0}$ being the region containing 0. As an example where such conditions are satisfied consider the case where $(X_1, e_2, e_3, \ldots)$ are iid standard normal, where there are only two regions $F_1 = \{x : x \le 0\}$ and $F_2 = \{x : x > 0\}$, and where $a^1 = 0$ and $a^2 = \tfrac{1}{2}$. Then $P(X_t > 0) \ge \tfrac{1}{2}$, while $P(X_t < 0) \ge \gamma$ for some $\gamma > 0$, since $E(X_t^2)$ is uniformly bounded in t.


3.2  Random coefficient autoregressive processes.

These processes were treated in considerable generality in the stationary case in Sections 4.2 and 6 of T1. Here we will restrict ourselves to a scalar first order model. Extension to higher order vector models involves the same principles, but is notationally more complex.

Assume that $\{X_t\}$ is given on $-\infty < t < \infty$ by

$$X_t - (a + b_t)X_{t-1} = e_t \tag{3.19}$$

where $\{e_t\}$ and $\{b_t\}$ are zero-mean independent processes each consisting of independent variables such that $m_1 \le E(e_t^2) \le M_1$ and $E(b_t^2) \le M_2$, where $m_1$, $M_1$ and $M_2$ are positive constants such that $a^2 + M_2 < 1$. These conditions guarantee that there exists a $\mathcal{F}_t^b \vee \mathcal{F}_t^e$-measurable solution of (3.19) with uniformly bounded second moments. This solution can be expressed as

$$X_t = \sum_{i=0}^{\infty} a_{ti}e_{t-i} \tag{3.20}$$

with $a_{ti} = \prod_{j=0}^{i-1}(a + b_{t-j})$ and where by definition $a_{t0} = 1$.

We consider the problem of estimating the parameter a. Since $\tilde X_{t|t-1} = aX_{t-1}$ it is clear that there is a unique solution to $\partial Q_n/\partial a = 0$ with $Q_n$ as in (2.2), namely $\hat a_n = (\sum_{t=2}^{n} X_tX_{t-1})/(\sum_{t=2}^{n} X_{t-1}^2)$, assuming that observations $(X_1,\ldots,X_n)$ are available. It is our task to find the properties of this estimate.
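The estimate is the same ratio as in an ordinary AR(1) fit, even though the coefficient is random. The Python sketch below simulates a first order random coefficient model with a noise variance that changes over time and evaluates the ratio. The Gaussian distributions, the alternating noise standard deviations 1.0 and 1.5, the burn-in length and all function names are illustrative assumptions, chosen so that the second and fourth moment conditions discussed in this section plausibly hold.

```python
import random


def simulate_rca1(a, b_sd, n, seed=0, burn_in=200):
    """Simulate X_t = (a + b_t) X_{t-1} + e_t.

    b_t ~ N(0, b_sd^2) and e_t ~ N(0, s_t^2) are independent, with s_t
    alternating between 1.0 and 1.5 so that the noise is not identically
    distributed (an assumed example of nonhomogeneous noise).
    """
    rng = random.Random(seed)
    x_prev, out = 0.0, []
    for t in range(n + burn_in):
        e_sd = 1.0 if t % 2 == 0 else 1.5
        x_prev = (a + rng.gauss(0.0, b_sd)) * x_prev + rng.gauss(0.0, e_sd)
        if t >= burn_in:  # discard the start-up values
            out.append(x_prev)
    return out


def cls_rca1(x):
    """Conditional least squares ratio: sum X_t X_{t-1} / sum X_{t-1}^2."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den


series = simulate_rca1(a=0.5, b_sd=0.3, n=20000, seed=2)
print(cls_rca1(series))  # a value near 0.5 is expected
```

With a = 0.5 and b_sd = 0.3 the quantity a^2 + E(b_t^2) equals 0.34, so the simulated model satisfies the stated second moment condition.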

Theorem 3.2: Let $\{X_t\}$ be as above. If in addition $E(X_t^4) \le K$ for some constant K, then $\hat a_n \to a$ a.s.

Proof: It is easily seen that

$$f_{t|t-1} = E\{(X_t - \tilde X_{t|t-1})^2 \mid \mathcal{F}_{t-1}^X\} = h_tX_{t-1}^2 + g_t \tag{3.21}$$

where $h_t = E(b_t^2)$ and $g_t = E(e_t^2)$. Thus

$$E\Big\{\frac{\partial \tilde X_{t|t-1}}{\partial a}\, f_{t|t-1}\, \frac{\partial \tilde X_{t|t-1}}{\partial a}\Big\} = E\{X_{t-1}^2(h_tX_{t-1}^2 + g_t)\} \tag{3.22}$$

is uniformly bounded in t and CN1 of Theorem 2.1 is satisfied. The conditions CN2 and CN4 are trivially satisfied since $\partial \tilde X_{t|t-1}/\partial a = X_{t-1}$ is independent of a. It remains to check CN3. This can be done using martingale techniques analogous to those used in the proof of Theorem 3.1.

From (3.21) we have that $\{W_t\} = \{(b_tX_{t-1} + e_t)^2 - (h_tX_{t-1}^2 + g_t)\}$ is a martingale difference sequence with respect to $\{\mathcal{F}_t^X\}$. Using our independence assumptions and the fact that uniform boundedness of $E(X_t^4)$ implies uniform boundedness of $E(b_t^4)$ and $E(e_t^4)$ we have

$$E(W_t^2) = E(b_t^4)E(X_{t-1}^4) + 4h_tg_tE(X_{t-1}^2) + E(e_t^4) - h_t^2E(X_{t-1}^4) - g_t^2 \le K_1 \tag{3.23}$$

for some constant $K_1$. From the martingale strong law we have $n^{-1}\sum_{t=2}^{n} W_t \to 0$ a.s. as $n \to \infty$, and thus, since $b_tX_{t-1} + e_t = X_t - aX_{t-1}$, we have

$$\frac{1}{n}\sum_{t=2}^{n} (X_t - aX_{t-1})^2 - \frac{1}{n}\sum_{t=2}^{n} (h_tX_{t-1}^2 + g_t) \to 0 \quad \text{a.s.} \tag{3.24}$$

On the other hand, $\{V_t\} = \{(X_t - aX_{t-1})X_{t-1}\}$ also forms a martingale difference sequence with respect to $\{\mathcal{F}_t^X\}$ with $E(V_t^2) \le C_2$ for some constant $C_2$ and thus

$$\frac{1}{n}\sum_{t=2}^{n} X_tX_{t-1} - \frac{a}{n}\sum_{t=2}^{n} X_{t-1}^2 \to 0 \quad \text{a.s.} \tag{3.25}$$


Since $E(X_t^2)$ is uniformly bounded, we have $n^{-1}X_n^2 \to 0$ a.s. as $n \to \infty$, and combining (3.24) and (3.25) it follows that

$$\frac{1}{n}\sum_{t=2}^{n} X_t^2 - \frac{1}{n}\sum_{t=2}^{n} (a^2 + h_t)X_{t-1}^2 - \frac{1}{n}\sum_{t=2}^{n} g_t \to 0 \quad \text{a.s.}$$

Here $h_t \ge 0$ and $|a| < 1$. Hence

$$\liminf_{n\to\infty} \frac{1}{n}\sum_{t=2}^{n} X_t^2 \ge (1 - a^2)^{-1}\liminf_{n\to\infty} \frac{1}{n}\sum_{t=2}^{n} g_t \ge \frac{m_1}{1 - a^2} \tag{3.26}$$

and because $\partial \tilde X_{t|t-1}/\partial a = X_{t-1}$, this shows that CN3 of Theorem 2.1 is fulfilled and the theorem is proved. ||

It should be noted (cf. Theorem 4.2 of T1) that in the stationary case $E\{X_t^2\} < \infty$ was sufficient to guarantee strong consistency of $\hat a_n$. The condition $E(X_t^4) \le K$ used in the present theorem will be satisfied if $E(e_t^4) \le C_1$ and $E\{(a + b_t)^4\} \le C_2 < 1$ for two constants $C_1$ and $C_2$. It was needed to obtain a uniform bound on $E(W_t^2)$ in (3.23). Using Corollary 3.3.5 of Stout (1974) it is possible to weaken this to requiring a uniform upper bound on $E\{|W_t|(\log^+|W_t|)^{1+\varepsilon}\}$ for some $\varepsilon > 0$.

As can be expected by analogy with the stationary case treated in T1, boundedness conditions on higher order moments are needed to ensure asymptotic normality.


Theorem 3.3: Let $\{X_t\}$ be as in Theorem 3.2. If in addition $E(e_t^8) \le C_1$ and $E\{(a + b_t)^8\} \le C_2 < 1$ for two constants $C_1$ and $C_2$, then $\hat a_n$ is asymptotically normal.

Proof: Using (3.22) it is seen that the quantity $R_n$ of (2.8) is given by

$$R_n = \sum_{t=2}^{n} E\{X_{t-1}^2(h_tX_{t-1}^2 + g_t)\}. \tag{3.27}$$

In view of (3.20) we have $g_tE(X_{t-1}^2) \ge g_tE(e_{t-1}^2) = g_tg_{t-1} \ge m_1^2$, and it follows that DN1 of Theorem 2.2 is fulfilled.

Employing a subtraction-addition argument and the definition of the quantities used in DN2 it is clear that DN2 will be fulfilled if we can show that

$$\frac{1}{n}\sum_{t=2}^{n} X_{t-1}^2(b_tX_{t-1} + e_t)^2 - \frac{1}{n}\sum_{t=2}^{n} E\{X_{t-1}^2(h_tX_{t-1}^2 + g_t)\} \to 0 \quad \text{a.s.} \tag{3.28}$$

This will be done in several steps.

First observe that the stated moment conditions on $e_t$ and $b_t$ imply by (3.20) that $E(X_t^8) \le K_1$ for some constant $K_1$. Clearly $\{2b_te_tX_{t-1}^3\}$ is a martingale difference sequence with respect to $\{\mathcal{F}_t^X\}$, and since $E(X_{t-1}^6) \le K_2$ for some $K_2$, it follows from the martingale strong law that

$$\frac{1}{n}\sum_{t=2}^{n} 2b_te_tX_{t-1}^3 \to 0 \quad \text{a.s.} \tag{3.29}$$

Similarly $\{X_{t-1}^2(e_t^2 - g_t)\}$ and $\{X_{t-1}^4(b_t^2 - h_t)\}$ define martingale difference sequences and using $E(X_t^8) \le K_1$ and the martingale strong law we have

$$\frac{1}{n}\sum_{t=2}^{n} X_{t-1}^2e_t^2 - \frac{1}{n}\sum_{t=2}^{n} X_{t-1}^2g_t \to 0 \quad \text{a.s.} \tag{3.30}$$

$$\frac{1}{n}\sum_{t=2}^{n} X_{t-1}^4b_t^2 - \frac{1}{n}\sum_{t=2}^{n} X_{t-1}^4h_t \to 0 \quad \text{a.s.} \tag{3.31}$$

Inserting (3.29) - (3.31) in (3.28) it is seen that to prove (3.28) it is sufficient to prove

$$\frac{1}{n}\sum_{t=2}^{n} \{X_{t-1}^2 - E(X_{t-1}^2)\}g_t \to 0 \quad \text{a.s.} \tag{3.32}$$

and

$$\frac{1}{n}\sum_{t=2}^{n} \{X_{t-1}^4 - E(X_{t-1}^4)\}h_t \to 0 \quad \text{a.s.} \tag{3.33}$$


Let $Y_t = \{X_t^2 - E(X_t^2)\}u_t$ where $\{u_t\}$ is a positive deterministic sequence bounded above by some constant $k$, and consider the sequence of $\sigma$-fields $\{F_t, -\infty < t < \infty\}$ where $F_t = F_t^e \vee F_t^b$. We will prove that $\{Y_t\}$ is a mixingale difference sequence with respect to $\{F_t\}$, i.e. (cf. Hall and Heyde 1980, p. 19) that for sequences of non-negative constants $c_t$ and $\psi_s$, where $\psi_s \to 0$ as $s \to \infty$, we have

$$E[\{E(Y_t \mid F_{t-s})\}^2] \le \psi_s^2 c_t^2 \qquad (3.34i)$$

and

$$E[\{Y_t - E(Y_t \mid F_{t+s})\}^2] \le \psi_{s+1}^2 c_t^2 \qquad (3.34ii)$$

for all $t \ge 1$ and $s \ge 0$. Since $Y_t$ is $F_t$-measurable, the condition (ii) in the definition of a mixingale is trivially fulfilled for any choice of $c_t$ and $\psi_s$.


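Using (3.20) and independence properties we have

$$E(X_t^2) = \sum_{i=0}^{\infty} E\Big\{\prod_{j=0}^{i-1}(a+b_{t-j})^2\Big\}\, g_{t-i} \qquad (3.35)$$

and for $s > 0$

$$E(X_t^2 \mid F_{t-s}) = \sum_{i=0}^{s-1} E\Big\{\prod_{j=0}^{i-1}(a+b_{t-j})^2\Big\}\, g_{t-i} + \sum_{i=s}^{\infty}\sum_{j=s}^{\infty} E(a_{ti}a_{tj} \mid F_{t-s})\, e_{t-i}e_{t-j}. \qquad (3.36)$$

However, for $i, j \ge s$ we have

$$E(a_{ti}a_{tj} \mid F_{t-s}) = \prod_{k=s}^{i-1}(a+b_{t-k}) \prod_{m=s}^{j-1}(a+b_{t-m})\, E\Big\{\prod_{k=0}^{s-1}(a+b_{t-k})^2\Big\} \qquad (3.37)$$

where by definition $\prod_{k=s}^{i-1}(a+b_{t-k}) = 1$ for $i = s$. Combining (3.35) - (3.37) we obtain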


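$$|E(Y_t \mid F_{t-s})| = \Big| u_t\, E\Big\{\prod_{k=0}^{s-1}(a+b_{t-k})^2\Big\} \Big[ \sum_{i=s}^{\infty}\sum_{j=s}^{\infty} \prod_{k=s}^{i-1}(a+b_{t-k}) \prod_{m=s}^{j-1}(a+b_{t-m})\, e_{t-i}e_{t-j} - \sum_{i=s}^{\infty} E\Big\{\prod_{k=s}^{i-1}(a+b_{t-k})^2\Big\}\, g_{t-i} \Big] \Big| \le k\,(a^2+M_2)^s\, |X_{t-s}^2 - E(X_{t-s}^2)| \qquad (3.38)$$

and since $E(X_t^4) \le K$ and $a^2 + M_2 < 1$ it follows that $\{Y_t\}$ is a mixingale difference sequence. Moreover, it follows from the mixingale convergence theorem (Hall and Heyde 1980, Th. 2.21) that (3.32) holds by choosing $u_t = g_{t+1}$.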




Next let $Z_t = \{X_t^4 - E(X_t^4)\}u_t$. We will show that $\{Z_t, F_t\}$ defines a mixingale difference sequence. Again the condition (ii) in the definition of a mixingale sequence is trivially fulfilled. Using (3.20) and independence properties we have for $s > 0$


s-1  s— 1  s-1  s-1 


E(xt|Ft-s)  =1111  EVi  ati  ati  ati  et-i  Vi  Vi  Vi  ) 

i  =0  i  =0  i  =0  i  =0  C11  Z12  Z13  Z  4  Z  1  2  Z  3  4 


12  3  4 


oo  oo  oo 


y  y  y  E(a  .  a  .  a  .  a  .  F  )e  .  e  .  e  .  e 

^  ^  ti.  tl,  ti,  ti.  t-s  t-i,  t-i~  t-i,  t-i, 

=s  i,=s  1  =s  i  =s  1234  1234 


1  x2  J  3  4 


s-1  °°  00 


3  y  y  y  g  E(a  .  a  .  a  .  |F  )e  .  e 
•  r,  •  -  V  t-i,  ^  ti,  ti,  ti*  t-s'  t-i,  t-1 


•  fl  ■  •  t-1,  tl..  tl,  tl  .  ' 

1=0  i  =s  l  =s  1  134 

13  4 


3  4 


s-1  00 

3  l  I  E(Vi  )  E  (a*t  a  (F  )et  ) 

ix=0  i4=s  L  X1  Z11  tx4  L  S  4 


(3.39)


A corresponding splitting up of $E(X_t^4)$ yields


s-1  s-1  s-1  s-1 


J  J  t  ^  j.  * 

E(\j  =  J  J  y  y  E(a  .  a^.  a  .  a  e  .  e^  .  e  .  e  .  ) 
1  i,  =0  i.=0  i.=0  i  -0  “l  tl2  “3  Cl4  '"h  ^  ^ 


12  3  4 


OO  OO  OO  oo 


i-  y  y  y  y  Eta,.,  a  .  a„ .  a  .  e  .  e„  .  e  .  e.  .  ) 

.L  .L  .L  L  ti1  ti,  ti,  ti.  t-i,  t-i.,  t-i,  t-i  ' 

i  =s  i  =s  i,=s  .  12341234 

123  l  =s 
4 

s— 1  00 

1  3  l  l  Vi  Vi  E(ati  VV 

i  =0  i  =s  1  1  x  4  V  X14 

1  4 


(3.40)


Since  for  i^.^VV  iL  s»  we  have  that 


yv>:.>:v;v;v>v:w 


it  follows  that 


OO  OO  CO  CO 


T  y  Y  y  E(a  .  a^ .  a  a  |F  ) e  .  e  .  e  .  c  , 

L  .L  .L  .L  ti,  ti~  ti,  ti.  t-s  t-i.  t-i.,  t-i,  t-i. 


.*■  .  **  .L  tl,  tl-  tl-  tl. 

i  =s  i  =s  1  ,=s  i  =s  1234 
12  3  4 


1  *■  2  "3  4 


44  s/2  4 

=  Ei  n  (a+b  .)  *!  <  X* 

.  „v  t-r  t-s  —  2  t-s 

1=0 

Similarly  for  i^  £  s-1  and  i^,  i4  2l  s  we  have 


(3.42) 


E(a  .  a^ .  a^ . 
C11  tl3  ll4 


IF  )  =  I!  (a+b.  .)  H  (a+b  )E(  H  (a+b  .)  n 
t-s  t-i  .  t-jJ  .  t-i  .  . 

J=s  J  J=s  h=0  J  J=1X 


K:  a  IF  )  =  n  (a+b  )E^  n  (a+b  )H  H  (a+b  ) 
T  1  T14  t_S  j=s  3  j=0  J  j=i1  J 


It  follows  that 


s-  1  00  00 


y  y  y  K  •  E(a  .  a  .  a  .  IF  )e  .  e  . 

>_a  A-  /®t-l.  tl,  tl,  tl'  t-SJ  t-1,  t-1 


•  r.  •  •  t-1,  tX,  tl_  tl. 

1=0  1,=S  l  =s  1  1  34 

13  4 


3  4 


2  J2 


-  I  8t_i  E  n  (a+b  )’  n  (a+b  )  x  5  m]Sc2'  Xt_ 

ij=0  1  J-0  J+ij 

7 

Moreover,  since  ]E(e^)|  <  for  some  constant  M^, 


+s/4  2 


s-1  00 


X  „  X  E<<i> c  JFt-5)et-i. 

lj=0  !4=s  1  14  4 


(3.44) 


(3.45) 


l  E(eJ_i  )  E  "  (a+b  )4  IT  (a+b  )  X  <  M  sC^  |X  | 
ij=o  h  j=0  J  j=ij 


a 


Reasoning in an entirely analogous way for the two last terms on the
right hand side of (3.40) and inserting in (3.39) and (3.40) we have


iut[c2/2fxL  *  e<cs» 


+  M,sC*/4{X?  +  E(xf  )}  +  M„sC^8 1 X 

1  2  t-s  t-s  2 


t-s 


] 


(3.47) 


,8, 


Since  k,  E(Xt)  £  and  <  1,  it  follows  from  the  mixingale 
convergence  theorem  with  u  =h  ^  that  (3.33)  holds,  and  the  theorem 
is  proved.  |  | 

Again it should be noted that in the stationary case (T1, Th. 4.3) $E(X_t^4) < \infty$ is sufficient to guarantee asymptotic normality of $\hat{a}_n$, while in the present case we require $E(X_t^8) \le K$.
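
A small simulation may help to make the nonstationary random coefficient setting concrete. The sketch below assumes the first order model $X_t = (a + b_t)X_{t-1} + e_t$ of (3.19), started at $X_0 = 0$, with Gaussian innovations and two hypothetical bounded variance sequences `g` and `h` (none of these choices are taken from the report). The conditional least squares estimate is the minimizer of $\sum_t (X_t - aX_{t-1})^2$.

```python
import random

def simulate_rca(n, a, g, h, seed=0):
    """Simulate X_t = (a + b_t) X_{t-1} + e_t, started at X_0 = 0.

    g(t) and h(t) give the (possibly time-varying) variances of e_t and b_t.
    Gaussian innovations are an assumption made only for this sketch.
    """
    rng = random.Random(seed)
    x = [0.0]
    for t in range(1, n + 1):
        b_t = rng.gauss(0.0, h(t) ** 0.5)
        e_t = rng.gauss(0.0, g(t) ** 0.5)
        x.append((a + b_t) * x[-1] + e_t)
    return x

def cls_estimate(x):
    """Minimizer of Q_n(a) = sum_t (X_t - a X_{t-1})^2."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den

# Hypothetical bounded variance sequences (illustrative, not from the report).
g = lambda t: 1.0 + 0.5 * (t % 2)       # E(e_t^2) alternates between 1.0 and 1.5
h = lambda t: 0.04 if t % 3 else 0.06   # E(b_t^2) takes the values 0.04 and 0.06

a_true = 0.5
x = simulate_rca(20000, a_true, g, h, seed=1)
a_hat = cls_estimate(x)
print(f"true a = {a_true}, CLS estimate = {a_hat:.3f}")
```

With these values $a^2 + E(b_t^2) < 1$ for all $t$, so the simulated series is nonstationary only through its time-varying variances and does not explode.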

3.3  Doubly stochastic processes.

Random coefficient autoregressive processes are special cases of what we have termed doubly stochastic time series models in Tjøstheim (1983, 1984b). In the simplest first order case these are given by

$$X_t = \theta_t X_{t-1} + e_t \qquad (3.48)$$

where $\{a + b_t\}$ of (3.19) is now replaced by a more general stochastic process $\{\theta_t\}$. The process $\{\theta_t\}$ is usually assumed to be independent of $\{e_t\}$ and to be generated by a separate mechanism. Thus $\{\theta_t\}$ could be a Markov chain or it could itself be an AR process. We refer to Tjøstheim (1983, 1984b) for a definition and properties in the general case.

What makes doubly stochastic processes especially interesting is that in many cases it is possible to construct recursive forecasting algorithms, and for the case where $\{\theta_t\}$ is an ARMA process, there is a close connection with Kalman type dynamical state space models (cf. Harrison and Stevens 1976, Ledolter 1981 and Tjøstheim 1983). This type of process has attracted considerable attention lately, and there exist procedures (see e.g. Ledolter 1981) for computation of unknown parameters, but as far as we know there are no results available concerning the properties of these estimates.

We will only consider a very special case, namely the case where $\{\theta_t\}$ is a first order MA process given by

$$\theta_t = a + \varepsilon_t + b\,\varepsilon_{t-1}, \qquad (3.49)$$

where $\{\varepsilon_t\}$ consists of zero-mean iid random variables independent of $\{e_t\}$ and with $E(\varepsilon_t^2) < \infty$. Both $\{e_t\}$ and $\{\varepsilon_t\}$ will be assumed to be defined on $-\infty < t < \infty$. We will only consider the estimation of $a$, but we believe that even this simple problem gives a good illustration of the increase in difficulties as we move away from random coefficient autoregressive processes.

To be able to construct Kalman-like algorithms for the predictor $X_{t|t-1}$, the process must be conditionally Gaussian and this requires (Tjøstheim 1983) that $\{e_t\}$ and $\{\varepsilon_t\}$ be Gaussian, and that there is an initial variable $X_0$ such that the conditional distribution of $\theta_0$ given $X_0$ is Gaussian. This last requirement is achieved here by choosing $X_0 = 0$. Obviously it implies that $\{X_t\}$ is nonstationary.

Theorem 3.4: Let $\{X_t, t \ge 1\}$ be given by (3.48) and (3.49) under the above stated assumptions. Assume that $E(X_t^4) \le K$ for some constant $K$, and that the MA parameter $b$ is less than 1/2 in absolute value. Then there exists a sequence of estimators $\{\hat{a}_n\}$ such that $\hat{a}_n \to a$ a.s. as $n \to \infty$, and such that $\hat{a}_n$ is obtained by minimization of $Q_n$ in (2.2) as described in the conclusion of Theorem 2.1.



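Proof: We follow Tjøstheim (1983) and use the notation $m_t = E(b\varepsilon_t \mid F_t^X)$ and $\gamma_t = E\{(b\varepsilon_t - m_t)^2 \mid F_t^X\}$. Then it is easily shown from (3.48) and (3.49) that under the stated assumptions we have

$$X_{t|t-1} = (a + m_{t-1})X_{t-1}. \qquad (3.50)$$

Moreover, it was shown in Tjøstheim (1983) that $X_{t|t-1}$ can be obtained recursively from the relations

$$m_t = \frac{b\,\delta^2 X_{t-1}(X_t - X_{t|t-1})}{\sigma^2 + X_{t-1}^2(\delta^2 + \gamma_{t-1})} \qquad (3.51)$$

$$\gamma_t = b^2\delta^2 - \frac{X_{t-1}^2\,\delta^4 b^2}{\sigma^2 + X_{t-1}^2(\delta^2 + \gamma_{t-1})} \qquad (3.52)$$

for $t \ge 1$. Here $\delta^2 = E(\varepsilon_t^2)$ and $\sigma^2 = E(e_t^2)$, while $m_0 = E(b\varepsilon_0) = 0$ and $\gamma_0 = E(b^2\varepsilon_0^2) = b^2\delta^2$. It follows that the conditional prediction error is given by

$$f_{t|t-1} = E\{(X_t - X_{t|t-1})^2 \mid F_{t-1}^X\} = (\delta^2 + \gamma_{t-1})X_{t-1}^2 + \sigma^2. \qquad (3.53)$$

From (3.50) and (3.51) we have

$$\frac{\partial X_{t|t-1}}{\partial a} = \Big(1 + \frac{\partial m_{t-1}}{\partial a}\Big)X_{t-1} \qquad (3.54)$$

and, since $\gamma_t$ is independent of $a$,

$$\frac{\partial m_t}{\partial a} = -\frac{\delta^2 b\,X_{t-1}^2\{1 + \partial m_{t-1}/\partial a\}}{\sigma^2 + X_{t-1}^2(\delta^2 + \gamma_{t-1})} \qquad (3.55)$$

while for $k \ge 2$

$$\frac{\partial^k m_t}{\partial a^k} = -\frac{\delta^2 b\,X_{t-1}^2\,\partial^k m_{t-1}/\partial a^k}{\sigma^2 + X_{t-1}^2(\delta^2 + \gamma_{t-1})} = 0 \qquad (3.56)$$

due to the initial condition $m_0 = 0$. (It is also seen directly from (3.51) that $m_t$ depends linearly on $a$.) It follows that CN2 and CN4 of Theorem 2.1 are trivially fulfilled.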
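
The recursion (3.50) - (3.53) is straightforward to run numerically. The following is a minimal sketch, assuming Gaussian $\{e_t\}$ and $\{\varepsilon_t\}$, $X_0 = 0$, and known values of $a$, $b$, $\delta^2$ and $\sigma^2$. The update for $m_t$ is written in the usual Gaussian conditioning form (covariance over variance times innovation), which is an assumption of this sketch; the function and variable names are illustrative.

```python
import random

def simulate(n, a, b, delta2, sigma2, seed=0):
    """X_t = theta_t X_{t-1} + e_t with theta_t = a + eps_t + b*eps_{t-1}, X_0 = 0."""
    rng = random.Random(seed)
    eps_prev = rng.gauss(0.0, delta2 ** 0.5)  # eps_0
    x = [0.0]
    for _ in range(n):
        eps = rng.gauss(0.0, delta2 ** 0.5)
        theta = a + eps + b * eps_prev
        x.append(theta * x[-1] + rng.gauss(0.0, sigma2 ** 0.5))
        eps_prev = eps
    return x

def doubly_stochastic_filter(x, a, b, delta2, sigma2):
    """Return pred[t] = X_{t|t-1}, f[t] = f_{t|t-1}, m[t], gam[t] for t = 0..n."""
    n = len(x) - 1
    m = [0.0] * (n + 1)                 # m_0 = 0
    gam = [b * b * delta2] * (n + 1)    # gamma_0 = b^2 delta^2
    pred = [0.0] * (n + 1)
    f = [sigma2] * (n + 1)
    for t in range(1, n + 1):
        xp = x[t - 1]
        pred[t] = (a + m[t - 1]) * xp                      # (3.50)
        f[t] = sigma2 + xp * xp * (delta2 + gam[t - 1])    # (3.53)
        gain = b * delta2 * xp / f[t]
        m[t] = gain * (x[t] - pred[t])                     # (3.51)
        gam[t] = b * b * delta2 - (b * delta2 * xp) ** 2 / f[t]  # (3.52)
    return pred, f, m, gam

a, b, delta2, sigma2 = 0.3, 0.4, 0.04, 1.0   # hypothetical parameter values
x = simulate(5000, a, b, delta2, sigma2, seed=3)
pred, f, m, gam = doubly_stochastic_filter(x, a, b, delta2, sigma2)
mse = sum((x[t] - pred[t]) ** 2 for t in range(1, len(x))) / (len(x) - 1)
print(f"mean squared one-step prediction error: {mse:.3f}")
```

Note that $\gamma_t$ and $f_{t|t-1}$ do not involve $a$, which is what makes the derivatives in (3.54) - (3.56) so simple.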

Since $\gamma_{t-1} \ge 0$ we have from (3.55) and the summation formula for a finite geometric series that

$$\Big|\frac{\partial m_t}{\partial a}\Big| \le |b|\Big\{1 + \Big|\frac{\partial m_{t-1}}{\partial a}\Big|\Big\} \le |b|\,\frac{1 - |b|^t}{1 - |b|} \qquad (3.57)$$

and it follows that $|\partial m_t/\partial a|$ is uniformly bounded. Similarly it follows from (3.52) that $\gamma_t \le 2b^2\delta^2$. Using (3.53) and (3.54) it is now seen that the condition $E(X_t^4) \le K$ implies that CN1 of Theorem 2.1 is fulfilled.

Since we assume that $|b| < 1/2$, we have by (3.57) that $|\partial m_t/\partial a| \le |b|/(1-|b|) < 1$ and thus $\liminf_{t\to\infty}(1 + \partial m_t/\partial a)^2 > 0$. From (3.54) and the form of CN3 it is clear that to prove that CN3 holds, it will be sufficient to prove that

$$\liminf_{n\to\infty} \frac{1}{n}\sum_{t=1}^{n} X_t^2 > 0 \quad \text{a.s.} \qquad (3.58)$$

Note that with our initial condition $X_0 = 0$ we have

$$X_t = \sum_{i=0}^{t-1} a_{ti}\, e_{t-i} \qquad (3.59)$$

where

$$a_{ti} = \prod_{j=0}^{i-1}\theta_{t-j} = \prod_{j=0}^{i-1}(a + \varepsilon_{t-j} + b\,\varepsilon_{t-j-1}) \qquad (3.60)$$

with $a_{t0} = 1$. It follows at once that $E(X_t^2) \ge \sigma^2$, and thus (3.58) will be proved if we can prove that

$$\frac{1}{n}\sum_{t=1}^{n} X_t^2 - \frac{1}{n}\sum_{t=1}^{n} E(X_t^2) \xrightarrow{a.s.} 0. \qquad (3.61)$$


This will be done by using the mixingale strong law of large numbers.

Let $Y_t = X_t^2 - E(X_t^2)$ and $F_t = F_t^e \vee F_t^{\varepsilon}$. Then $Y_t$ is $F_t$-measurable and condition (3.34ii) in the definition of a mixingale difference sequence is trivially fulfilled. Moreover, we have from (3.59) and independence properties that for $s \ge 0$

$$E(X_t^2) = \sigma^2\Big[\sum_{i=0}^{s-1} E(a_{ti}^2) + \sum_{i=s}^{t-1} E(a_{ti}^2)\Big] \qquad (3.62)$$

where, by definition, the first sum is zero for $s = 0$. It is not difficult to show that

2  s-1  2  t-1 

E ( X  | F  )  =  a2  7  E(a^ . )  +  2  I  E(ak.a^  e_  . I Ffc  )e 
t1  t-s'  .Ln  tr  .Ln  v  ti  ts  t-i1  t-s'  t-s 

1=0  i=0 


t-1  t-1 


♦  y  y  E(a  .a  ,|F  )e  .e 

.L  a  ,L  ,  tl  tj '  t-S^  t-1  t-1 
1=S+1  J=S+1  J 


(3.63) 


For  i  <  s  we  have 


E{a  .  a  e  .  F  }  =  E[E(a  .a  e  . |F  .  .}|F  ], 

ti  ts  t-i  t-s  L  ti  ts  t-i1  t-i-1  1  t-sJ 


A 


3 


(3.64) 


On  the  other  hand  for  i  >  s 


i-1 


E{a  . a  e  . | F  }  =  e  .f  n  8t  t 
ti  ts  t-i1  t-s  t-il.  t-j 

j=S 


s-2 


<H0  6t-j1<"Et-s..*bEt-s’2iFt-s 


i-1 


=  e  (  n  9  .) 

t_1  j=s  t-3 


2  2 


s--' 


a  +2abet_s+b  e  t_s)E(  .)  +2(a+bet_s)  Ej  ( .n^  .) 


s-2 


+  n 


s-2 


(  II  0  .)r“ 

j-o  t-s+1 


i-1 


=A  ct-i(j!isVj)K(t*s)  • 


and  hence 


t-1 


7  E(a  .a  e  . | F  ) e  =  e  X  K(t,s). 

L  ti  ts  t-c  t-s  t-s  t-s  t-s  ’ 

i=0 


Using  similar  arguments  it  is  not  difficult  to  show  that 
t-1  t-1 


J  ,  .1  ,E(atiatj|Ft-s)et-iet-j  ■  Xt-SIC(t-s) 
i=s+l  j=s+l  J  J 


Inserting  in  (3.62)  and  (3.63)  we  have 


E(Vt|Ft_s)  ,  E(Xt2|Ft.s)  -  E(X2)  -  (2et.sXt.s«2_s)K(t,s) 


(3.06) 


(3.67) 


(3.68) 


-  a2E 


s-l  t-1  i-1 

f(  n  e*  .)  I  (  nej  ) 

i  j  =0  J  i=s  j=s  J 


(3.69) 


Since  E(X  )  <  K,  for  some  K  ,  it  follows  from  the  proof  of  Theorem  4.2 
t  1  1  s-2 

of  Tjf&stheim  (1983)  that  there  is  a  positive  g<l  such  that  E(  IT  0  ^  .) 

j=0  3 


c  _  1  4 

=  0(g  ).  Since  E(X^)  <  K  implies  the  existence  of  constants  K2  and 

such  that  E(e^)  <  l<2  and  F.(e^)  <  Kj,  it  follows  from  (3.66)  and  (3.69) 
that  E[  { E  (Y^  |  s)}2]  =  0(g2^S-1^).  We  then  have  from  the  mixingale 

convergence  theorem  (Hall  and  Heyde  1980,  Th.  2.21)  that  (3.61)  holds 
and  the  theorem  is  proved.  | | 
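
Because $m_t$ depends linearly on $a$, the predictor $X_{t|t-1}$ is an affine function of $a$, say $X_{t|t-1}(a) = c_t + a\,d_t$ with $d_t = (1 + \partial m_{t-1}/\partial a)X_{t-1}$ as in (3.54) and $c_t$ the predictor evaluated at $a = 0$. Assuming that $Q_n$ of (2.2) is the sum of squared one-step prediction errors, it is then a quadratic in $a$ with an explicit minimizer. The sketch below illustrates this under the additional assumptions that $b$, $\delta^2$ and $\sigma^2$ are known and that the innovations are Gaussian; all names and parameter values are hypothetical.

```python
import random

def simulate(n, a, b, delta2, sigma2, seed=0):
    """X_t = theta_t X_{t-1} + e_t with theta_t = a + eps_t + b*eps_{t-1}, X_0 = 0."""
    rng = random.Random(seed)
    eps_prev = rng.gauss(0.0, delta2 ** 0.5)
    x = [0.0]
    for _ in range(n):
        eps = rng.gauss(0.0, delta2 ** 0.5)
        x.append((a + eps + b * eps_prev) * x[-1] + rng.gauss(0.0, sigma2 ** 0.5))
        eps_prev = eps
    return x

def cls_estimate_a(x, b, delta2, sigma2):
    """Closed-form minimizer of sum_t (X_t - c_t - a*d_t)^2."""
    m0 = 0.0                 # m_{t-1} evaluated at a = 0
    dm = 0.0                 # derivative of m_{t-1} with respect to a
    gam = b * b * delta2     # gamma_{t-1}, does not depend on a
    num = den = 0.0
    for t in range(1, len(x)):
        xp = x[t - 1]
        f = sigma2 + xp * xp * (delta2 + gam)   # (3.53)
        c = m0 * xp                             # predictor at a = 0
        d = (1.0 + dm) * xp                     # (3.54)
        num += d * (x[t] - c)
        den += d * d
        gain = b * delta2 * xp / f
        m0 = gain * (x[t] - c)                  # (3.51) evaluated at a = 0
        dm = -gain * xp * (1.0 + dm)            # (3.55)
        gam = b * b * delta2 - (b * delta2 * xp) ** 2 / f   # (3.52)
    return num / den

a_true, b, delta2, sigma2 = 0.3, 0.4, 0.04, 1.0
x = simulate(20000, a_true, b, delta2, sigma2, seed=2)
a_hat = cls_estimate_a(x, b, delta2, sigma2)
print(f"true a = {a_true}, CLS estimate = {a_hat:.3f}")
```

The closed form avoids a numerical search, but it relies entirely on the linearity in $a$; for a general $\{\theta_t\}$ process a numerical minimization of $Q_n$ would be needed.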

The condition DN2 of Theorem 2.2 is not easy to work with for the
present example and we have not ventured to prove asymptotic normality.

3.4  Autoregressive models with deterministic time varying coefficients

Autoregressive models with deterministic time varying coefficients
have  found  applications  in  several  areas,  in  particular  in  speech 
recognition  (cf.  Markel  and  Gray  1977) ,  and  it  is  of  interest  to 
develop  a  theory  of  inference  for  them.  To  our  knowledge  such  a  theory 
is  largely  nonexistent.  These  models  are  usually  classified  as  linear 
nonstationary  models  so  in  a  sense  they  fall  outside  the  scope  of  this 
paper.  However,  we  will  show  that  at  least  in  special  cases  it  is 
possible  to  use  the  theoretical  framework  developed  in  this  paper  to 
obtain  properties  of  parameter  estimates. 

We only look at a first order model, although this is not an essential restriction, and we assume that $\{X_t\}$ is given for all $t$ by

$$X_t - a(t,\alpha)X_{t-1} = e_t \qquad (3.70)$$

where $\{e_t\}$ consists of zero-mean iid variables with $E(e_t^2) = \sigma^2$, and where $a(\cdot,\alpha)$ is a deterministic function depending on a scalar parameter $\alpha$. In this subsection we will use the superscript 0 to denote the true value of $\alpha$.

Theorem 3.5: Let $\{X_t\}$ be given by (3.70), where $a(t,\alpha)$ is three times continuously differentiable in an open set $A$ containing the true value $\alpha^0$ of $\alpha$. Assume that $E(e_t^4) < \infty$, and that there exist positive constants $m$, $M$ and $g$ with $g < 1$ such that for all $t$

$$|a(t,\alpha^0)| \le g \quad \text{and} \quad \Big|\frac{\partial a(t,\alpha^0)}{\partial\alpha}\Big| \ge m \qquad (3.71)$$

and such that

$$\Big|\frac{\partial^i a(t,\alpha)}{\partial\alpha^i}\Big| \le M \qquad (3.72)$$

for all $t$, $\alpha \in A$ and $i = 0, 1, 2$ and 3.

Then there exists a sequence of estimators $\{\hat{\alpha}_n\}$ such that $\hat{\alpha}_n \to \alpha^0$ a.s., and such that $\hat{\alpha}_n$ is obtained by minimization of $Q_n$ in (2.2) as described in the conclusion of Theorem 2.1. Moreover, $\hat{\alpha}_n$ is asymptotically normal.



Proof: Using (3.72) we can express $\{X_t\}$ as a mean square and almost sure convergent expansion

$$X_t = \sum_{i=0}^{\infty} a_{ti}\, e_{t-i} \qquad (3.73)$$

with $a_{t0} = 1$ and $a_{ti} = \prod_{j=0}^{i-1} a(t-j,\alpha)$ for $i > 0$. Moreover, it follows from (3.71) and the mutual independence of the $e_t$'s that

$$E(X_t^2) = \sigma^2\sum_{i=0}^{\infty} |a_{ti}(\alpha^0)|^2 \le \frac{\sigma^2}{1-g^2}. \qquad (3.74)$$

From (3.70) and (3.73) we have that

$$X_{t|t-1} = a(t,\alpha)X_{t-1} \quad \text{and} \quad f_{t|t-1} = \sigma^2, \qquad (3.75)$$

and since $\partial^i X_{t|t-1}/\partial\alpha^i = \partial^i a(t,\alpha)/\partial\alpha^i \cdot X_{t-1}$, it follows from (3.72), (3.74) and (3.75) that CN1 and CN2 of Theorem 2.1 are fulfilled.

Considering the expression in CN4 of Theorem 2.1, it is seen that by the mean value theorem and (3.72) it is sufficient to prove

$$\limsup_{n\to\infty} \frac{1}{n}\sum_{t=1}^{n} X_t^2 < \infty \quad \text{a.s.} \qquad (3.76)$$

in order to have CN4 satisfied. On the other hand from the equality part of (3.74) we have $E(X_t^2) \ge \sigma^2$, and using (3.71) and the inequality part of (3.74) it follows that both CN3 and CN4 of Theorem 2.1 will be fulfilled if we can prove

$$\frac{1}{n}\sum_{t} X_t^2 - \frac{1}{n}\sum_{t} E(X_t^2) \xrightarrow{a.s.} 0. \qquad (3.77)$$


But for $s > 0$ we can use (3.73) and (3.74) to show that for $Y_t = X_t^2 - E(X_t^2)$ we have

$$E(Y_t \mid F^e_{t-s}) = \prod_{i=0}^{s-1} a^2(t-i,\alpha^0)\, \{X_{t-s}^2 - E(X_{t-s}^2)\}. \qquad (3.78)$$

It now follows from (3.71), (3.73) and $E(e_t^4) < \infty$ that there is a $K$ such that $E(X_t^4) \le K$ for all $t$. Hence, by (3.71) and (3.78) we have $E[\{E(Y_t \mid F^e_{t-s})\}^2] \le g^{2s}K_1$ for some $K_1$, and the mixingale convergence theorem implies (3.77). The consistency part of the theorem follows from Theorem 2.1.

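The quantity $R_n$ defined in (2.8) and used in the proof of asymptotic normality is given in our case by

$$R_n = \sigma^2\sum_{t=2}^{n} \Big|\frac{\partial a(t,\alpha^0)}{\partial\alpha}\Big|^2 E(X_{t-1}^2), \qquad (3.79)$$

and it follows at once from $E(X_t^2) \ge \sigma^2$ and the last part of (3.71) that DN1 of Theorem 2.2 is fulfilled. To show that DN2 of that theorem holds it is clearly sufficient to show that

$$\frac{1}{n}\sum_{t=2}^{n} \Big|\frac{\partial a(t,\alpha^0)}{\partial\alpha}\Big|^2 e_t^2 X_{t-1}^2 - \frac{1}{n}\,\sigma^2\sum_{t=2}^{n} \Big|\frac{\partial a(t,\alpha^0)}{\partial\alpha}\Big|^2 E(X_{t-1}^2) \xrightarrow{a.s.} 0. \qquad (3.80)$$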


We let $Z_t = |\partial a(t,\alpha^0)/\partial\alpha|^2\{e_t^2 X_{t-1}^2 - \sigma^2 E(X_{t-1}^2)\}$. Then using (3.73) and (3.74) and independence properties of $\{e_t\}$ it is not difficult to show that

$$E(Z_t \mid F^e_{t-s}) = \Big|\frac{\partial a(t,\alpha^0)}{\partial\alpha}\Big|^2 \sigma^2 \prod_{i=0}^{s-2} a^2(t-1-i,\alpha^0)\, \{X_{t-s}^2 - E(X_{t-s}^2)\}. \qquad (3.81)$$

From (3.71), (3.72) and $E(X_t^4) \le K$ it follows that $E[\{E(Z_t \mid F^e_{t-s})\}^2] \le g^{2(s-1)}M^2\sigma^2 K_2$ for some $K_2$, and the mixingale convergence theorem implies (3.80). The proof is now concluded by applying Theorem 2.2.


It should be realized that the conditions (3.71) and (3.72) are quite restrictive. Thus it is not entirely trivial to find explicit examples of functions satisfying these requirements.
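
As an illustration only, one family that appears to meet requirements of the type (3.71) and (3.72) is $a(t,\alpha) = \alpha(1 + 0.5\cos t)$ with $|\alpha^0| < 2/3$: then $|a(t,\alpha^0)| \le 1.5|\alpha^0| < 1$, the first derivative with respect to $\alpha$ lies between 0.5 and 1.5, and all higher derivatives vanish. This function is a hypothetical example, not one discussed in the report. Since it is linear in $\alpha$, the conditional least squares estimate has a closed form, as the following sketch (Gaussian $e_t$, series started at $X_0 = 0$, both assumptions of the sketch) shows.

```python
import math
import random

def coef(t, alpha):
    """Hypothetical coefficient function a(t, alpha) = alpha * (1 + 0.5*cos(t))."""
    return alpha * (1.0 + 0.5 * math.cos(t))

def simulate_tvar(n, alpha, sigma=1.0, seed=0):
    """Simulate X_t = a(t, alpha) X_{t-1} + e_t, started at X_0 = 0."""
    rng = random.Random(seed)
    x = [0.0]
    for t in range(1, n + 1):
        x.append(coef(t, alpha) * x[-1] + rng.gauss(0.0, sigma))
    return x

def cls_alpha(x):
    """Minimizer of sum_t (X_t - a(t, alpha) X_{t-1})^2 when a is linear in alpha."""
    num = den = 0.0
    for t in range(1, len(x)):
        w = coef(t, 1.0) * x[t - 1]   # derivative of a(t, alpha) X_{t-1} w.r.t. alpha
        num += w * x[t]
        den += w * w
    return num / den

alpha_true = 0.4
x = simulate_tvar(20000, alpha_true, seed=4)
alpha_hat = cls_alpha(x)
print(f"true alpha = {alpha_true}, CLS estimate = {alpha_hat:.3f}")
```

For coefficient functions that are nonlinear in $\alpha$ the same sum of squares would have to be minimized numerically.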


3.5  Other models

We could of course consider nonstationary versions of bilinear models. But we still face the same obstacles as in the stationary case (cf. Section 4.3 of T1), and again it seems that more progress has to be made on the problem of invertibility before serious analysis of estimates can be undertaken in the present framework.

For the model studied by Aase (1983), however, our theory is applicable and the conditions CN1-CN4 and DN1-DN2 result in conditions which are similar to his, although he considers some slightly different estimates.


4.  A maximum likelihood type penalty function.

The maximum likelihood type penalty function was studied in Section 5.1 of T1 and is given in the multivariate case as

$$L_n = \sum_{t=m+1}^{n} \phi_t = \sum_{t=m+1}^{n} \big[\ln\{\det(f_{t|t-1})\} + (X_t - X_{t|t-1})^T f_{t|t-1}^{-1}(X_t - X_{t|t-1})\big]. \qquad (4.1)$$

The resemblance to a Gaussian log likelihood and the martingale property of $\partial L_n/\partial\beta_i$ were discussed in T1. If $X_t - X_{t|t-1}$ is not independent of $F^X_{t-1}$, minimization of $L_n$ will in general produce estimates with different properties from those obtained using conditional least squares with $Q_n$ as in (2.2). This is the case for doubly stochastic time series models and in particular for random coefficient autoregressive series.

Corresponding to Theorem 5.1 of T1 and Theorem 2.1 of the present paper we have

Theorem 4.1: Assume that $\{X_t, t \in I\}$ is a d-dimensional process with $E\{|X_t|^2\} < \infty$ for $t \in I$ and such that $X_{t|t-1}(\beta)$ and $f_{t|t-1}(\beta)$ are almost surely twice continuously differentiable in an open set $B$ containing $\beta^0$. Moreover, assume that there are positive constants $M_1$ and $M_2$ such that for $t \ge m+1$


EN1 :  E- 

^t  0 

—  (B  ) 

1 

93i 

- 

and 


EN2 :  E 


2 

3  <J> 


1  (6°)  -  E 


33i33j 


a2* 


0.  If-X 


30i 93 j 


(S  ) I 


iM2 


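for $i, j = 1, \ldots, s$, and where expressions for these derivatives are given in (5.8) and (5.9) of T1.

Furthermore, we assume

EN3: $\liminf_{n\to\infty} \lambda^n_{\min}(\beta^0) > 0$ a.s., where $\lambda^n_{\min}(\beta^0)$ is the smallest eigenvalue of the symmetric matrix $C^n(\beta^0)$ with matrix elements given by

$$C^n_{ij}(\beta^0) = \frac{1}{n}\sum_{t=m+1}^{n}\Big[\mathrm{Tr}\Big\{f_{t|t-1}^{-1}(\beta^0)\frac{\partial f_{t|t-1}}{\partial\beta_i}(\beta^0)\, f_{t|t-1}^{-1}(\beta^0)\frac{\partial f_{t|t-1}}{\partial\beta_j}(\beta^0)\Big\} + 2\,\frac{\partial X^T_{t|t-1}}{\partial\beta_i}(\beta^0)\, f_{t|t-1}^{-1}(\beta^0)\frac{\partial X_{t|t-1}}{\partial\beta_j}(\beta^0)\Big]. \qquad (4.2)$$

EN4: Let $N_\delta = \{\beta : |\beta - \beta^0| < \delta\}$ be contained in $B$. Then

$$\limsup_{n\to\infty,\ \delta\downarrow 0}\ (n\delta)^{-1}\Big|\sum_{t=m+1}^{n}\Big[\frac{\partial^2\phi_t}{\partial\beta_i\partial\beta_j}(\beta) - \frac{\partial^2\phi_t}{\partial\beta_i\partial\beta_j}(\beta^0)\Big]\Big| < \infty \quad \text{a.s.}$$

for $i, j = 1, \ldots, s$.

Then there exists a sequence of estimators $\{\hat{\beta}_n\}$ minimizing $L_n$ of (4.1) such that the conclusion of Theorem 2.1 holds.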

Proof: As in the proof of Theorem 2.1 our proof consists in referring our stated conditions back to the conditions of Theorem 2.1 of T1. From Proposition 5.1 of T1 we have that $\{\partial\phi_t/\partial\beta_i, F_t^X\}$ is a martingale difference sequence, and from a version of the martingale strong law (Stout 1974, Th. 3.3.8) it follows from EN1 that $n^{-1}\,\partial L_n(\beta^0)/\partial\beta_i \to 0$ a.s. as $n \to \infty$, and A1 of Theorem 2.1 of T1 is fulfilled.

The sequence $\{[\partial^2\phi_t(\beta^0)/\partial\beta_i\partial\beta_j - E\{\partial^2\phi_t(\beta^0)/\partial\beta_i\partial\beta_j \mid F^X_{t-1}\}], F^X_t\}$ is trivially a martingale difference sequence, and EN2 implies via the just quoted law of large numbers that

$$\frac{1}{n}\frac{\partial^2 L_n}{\partial\beta_i\partial\beta_j}(\beta^0) - \frac{1}{n}\sum_{t=m+1}^{n} E\Big\{\frac{\partial^2\phi_t}{\partial\beta_i\partial\beta_j}(\beta^0) \,\Big|\, F^X_{t-1}\Big\} \xrightarrow{a.s.} 0. \qquad (4.3)$$

On the other hand from (5.11) of T1 we have that $n^{-1}\sum_{t=m+1}^{n} E\{\partial^2\phi_t(\beta^0)/\partial\beta_i\partial\beta_j \mid F^X_{t-1}\} = C^n_{ij}(\beta^0)$ with $C^n(\beta^0)$ as in (4.2). It follows from (5.12) of T1 that $C^n(\beta^0)$ is non-negative definite and A2 of Theorem 2.1 of T1 now follows from EN3 and (4.3). Finally, EN4 is just a restatement of A3 of Theorem 2.1 of T1.


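As for Theorem 2.1 the conditions EN1 and EN2 may be weakened.

We next turn to asymptotic normality and to the analogs of Theorem 5.2 of T1 and Theorem 2.2 of the present paper. We let $S'_n$, $R'_n$ and $T'_n$ be the matrices defined by

$$S'_n = (S'_{n,ij}) = \sum_{t=m+1}^{n} \frac{\partial\phi_t}{\partial\beta_i}\frac{\partial\phi_t}{\partial\beta_j}, \qquad (4.4)$$

$$T'_n = (T'_{n,ij}) = E(S'_{n,ij}) = \sum_{t=m+1}^{n} E\Big(\frac{\partial\phi_t}{\partial\beta_i}\frac{\partial\phi_t}{\partial\beta_j}\Big), \qquad (4.5)$$

$$R'_n = (R'_{n,ij}) = nC^n_{ij} = \sum_{t=m+1}^{n} E\Big\{\frac{\partial^2\phi_t}{\partial\beta_i\partial\beta_j} \,\Big|\, F^X_{t-1}\Big\}. \qquad (4.6)$$

Here expressions for $\partial\phi_t/\partial\beta_i$ and $E(\partial\phi_t/\partial\beta_i\ \partial\phi_t/\partial\beta_j)$ are given in (5.
and (5.18) of T1 and for $C^n$ in (4.2).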


Theorem  4.2:  Assume  that  the  conditions  of  Theorem  4.1  are  fulfilled 


and  assume  in  addition  that 


FN1 :  lim  inf  n  S  det  { R '  ( 6*^) }  >  0 


n  ->  00 


! :  (R*  (6°)  >_,i  S’  (B°)  (R' (0°)  } ^  I. 


Let  (6  }  be  the  estimators  obtained  in  Theorem  4.1.  Then 
n 


(R’(80)}"!5T11(80)  (8-6°)  -  W(0,I  ) 




Proof:  This  is  essentially  identical  to  the  proof  of  Theorem  2.1  and 
is  therefore  omitted. 


5.  Two  examples. 

In  Section  6  of  T1  it  was  seen  that  in  the  stationary  case  it 
was  possible  to  weaken  the  conditions  on  the  moments  of  random  coefficient 
autoregressive  processes  when  the  maximum  likelihood  type  penalty 
function  was  used.  The  following  examples  indicate  that  this  continues 
to  hold  true  for  nonstationary  doubly  stochastic  processes.  Only 
consistency  will  be  studied,  and  the  superscript  0  for  true  values  will 
be  dropped. 

5.1  A random coefficient autoregressive process.

We will study the first order model given by (3.19), but now we will make the assumption that $E(b_t^2) = \gamma > 0$ is a constant, and we will consider the problem of estimating both $a$ and $\gamma$.

2 

Theorem 5.1: Let $\{X_t\}$ be as in (3.19) with $E(b_t^2) = \gamma$. Assume that there exist two positive constants $m_1$ and $M_1$ such that $m_1 \le E(e_t^2) \le M_1$ and that $a^2 + \gamma < 1$. Then there exists a sequence of estimators $\{[\hat{a}_n, \hat{\gamma}_n]\}$ such that $[\hat{a}_n, \hat{\gamma}_n] \to [a, \gamma]$ a.s. and such that $[\hat{a}_n, \hat{\gamma}_n]$ is obtained by minimization of $L_n$ in (4.1) as described in the conclusion of Theorem 2.1.

Proof: For the process treated in this theorem we have that $\phi_t$ defined in (4.1) is given by

$$\phi_t = \ln(f_{t|t-1}) + (X_t - X_{t|t-1})^2/f_{t|t-1} \qquad (5.1)$$

where $X_{t|t-1} = aX_{t-1}$ and $f_{t|t-1} = \gamma X_{t-1}^2 + g_t$ with $g_t = E(e_t^2)$.


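We have

$$\frac{\partial\phi_t}{\partial a} = -2X_{t-1}(X_t - aX_{t-1})/f_{t|t-1} \qquad (5.2)$$

and

$$\frac{\partial\phi_t}{\partial\gamma} = \frac{1}{f_{t|t-1}}\frac{\partial f_{t|t-1}}{\partial\gamma} - \frac{(X_t - aX_{t-1})^2}{f_{t|t-1}^2}\frac{\partial f_{t|t-1}}{\partial\gamma}. \qquad (5.3)$$

Here $1/f_{t|t-1} \le 1/g_t \le 1/m_1$, while

$$\frac{1}{f_{t|t-1}}\frac{\partial f_{t|t-1}}{\partial\gamma} = \frac{X_{t-1}^2}{\gamma X_{t-1}^2 + g_t} \le \frac{1}{\gamma}. \qquad (5.4)$$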


Using similar arguments (cf. also Section 6 of T1) it is not difficult to show that the expectation of the absolute values of the first and second order derivatives of $\phi_t$ with respect to $a$ and $\gamma$ are bounded by $K_1 E(X_t^2)$ where $K_1$ is a constant. However, using independence properties of $\{b_t\}$ and $\{e_t\}$ it is seen from (3.20) that

$$(1-a^2-\gamma)^{-1} m_1 \le E(X_t^2) = \sum_{i=0}^{\infty}(a^2+\gamma)^i g_{t-i} \le (1-a^2-\gamma)^{-1} M_1 \qquad (5.5)$$

where we have also used $a^2 + \gamma < 1$. It follows that EN1 and EN2 of Theorem 4.1 are fulfilled.

Since $\gamma > 0$ and $a^2 + \gamma < 1$, there exists an open set $B$ that contains the true parameter vector, and is such that the closure of $B$ in the parameter space does not contain points with $\gamma = 0$ or $a^2 + \gamma = 1$. Using the
martingale  law  of  large  numbers  it  is  not  difficult  to  show  that 
there  is  a  constant  such  that 

1 


1  ?  (Vt-i  +  V' 

lm  sup  —  I  - 2 - 

n  -*•  00  n  t=2  YXt-1  +  gt 


a.s. 

< 


(5.6) 


when $a$ and $\gamma$ are contained in $B$. Using this result in combination with
the above majorizations and with the expression for the third order
derivatives (cf. formula (6.8) of T1) we have that EN4 follows from
the mean value theorem.

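It remains to check EN3. Since $f_{t|t-1}$ does not depend on $a$, while $X_{t|t-1}$ does not depend on $\gamma$, the matrix $C^n$ of EN3 is given by the diagonal matrix

$$C^n = \mathrm{diag}\Big(\frac{1}{n}\sum_{t=2}^{n}\frac{2X_{t-1}^2}{\gamma X_{t-1}^2 + g_t},\ \ \frac{1}{n}\sum_{t=2}^{n}\frac{X_{t-1}^4}{(\gamma X_{t-1}^2 + g_t)^2}\Big). \qquad (5.7)$$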


Assume  that  there  is  a  subsequence  indexed  by  n^  such  that  as 


n.  -*  oo,  then 
1 


t.  O  v  ^ 

1  1  2X*  I 

l  Y  t-l 

n.  1  v2 
it=2  YXt-l  *  *t 


(5.8) 


2  2  -1 

Since  2X^  (yX^  ^  +  g  )  <_  2y  ,  we  can  use  dominated  convergence 


to  show 


"i  2X. 


1  A  - 

-  y 

n  '•  Z 


(5.9) 


1  t=2  YXt-l  +  gt, 


as  n^  -»  oo .  However,  from  (5.5)  we  have  that  E(X^)  is  bounded  uniformly 
from  below,  and  it  follows  that  there  exists  a  6  >  0  such  that 


2  2-1 

P  (Xt  ^m^(l-a  -y)  }  _>  <5,  with  m^  as  in  (5.5).  Moreover,  the  ratio 

2x/(yx  +  gt)  is  monotonically  increasing  in  x  >  0,  so  that 


YXt-l  +  gt 


2m  (l-a2-^)-1 


ymjd-a  -y)”  +  Mj 


(5.10) 


for  all  t,  where  is  as  in  (5.5).  But  this  contradicts  (5.9)  and 
we  must  have 


1  it 

lim  inf  =  lim  inf  —  Y 
_  11  n  L 


n  2X: 


a.s. 

>  0 


n  ->  00 


t=2  YXt-l  +  gt 


It follows using the same argument that $\liminf_{n\to\infty} C^n_{22} > 0$ a.s., and the condition EN3 of Theorem 4.1 is verified.
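
To make the penalty function of this example concrete, the following sketch evaluates $L_n = \sum_t \phi_t$ with $\phi_t$ as in (5.1) for the random coefficient model and minimizes it over $(a, \gamma)$. For fixed $\gamma$ the criterion is quadratic in $a$, so $a$ is profiled out by weighted least squares and $\gamma$ is searched over a grid. It is assumed here that the variances $g_t = E(e_t^2)$ are known, that the innovations are Gaussian and that the series starts at $X_0 = 0$; the grid, the variance sequence and the parameter values are all hypothetical.

```python
import math
import random

def simulate_rca(n, a, gamma, g, seed=0):
    """X_t = (a + b_t) X_{t-1} + e_t with Var(b_t) = gamma, Var(e_t) = g(t), X_0 = 0."""
    rng = random.Random(seed)
    x = [0.0]
    for t in range(1, n + 1):
        b_t = rng.gauss(0.0, gamma ** 0.5)
        x.append((a + b_t) * x[-1] + rng.gauss(0.0, g(t) ** 0.5))
    return x

def profile_ml(x, g, gamma_grid):
    """Minimize L_n(a, gamma) = sum_t [ln f_t + (X_t - a X_{t-1})^2 / f_t]."""
    n = len(x) - 1
    best = (float("inf"), None, None)
    for gam in gamma_grid:
        # f[t-1] holds f_{t|t-1} = gamma * X_{t-1}^2 + g_t
        f = [gam * x[t - 1] ** 2 + g(t) for t in range(1, n + 1)]
        # For fixed gamma the minimizer in a is a weighted least squares estimate.
        num = sum(x[t] * x[t - 1] / f[t - 1] for t in range(1, n + 1))
        den = sum(x[t - 1] ** 2 / f[t - 1] for t in range(1, n + 1))
        a = num / den
        crit = sum(math.log(f[t - 1]) + (x[t] - a * x[t - 1]) ** 2 / f[t - 1]
                   for t in range(1, n + 1))
        if crit < best[0]:
            best = (crit, a, gam)
    return best[1], best[2]

g = lambda t: 1.0 + 0.5 * (t % 2)           # hypothetical known E(e_t^2)
a_true, gamma_true = 0.5, 0.09
x = simulate_rca(10000, a_true, gamma_true, g, seed=5)
grid = [0.01 * k for k in range(1, 41)]     # candidate values of gamma
a_hat, gamma_hat = profile_ml(x, g, grid)
print(f"a_hat = {a_hat:.3f}, gamma_hat = {gamma_hat:.2f}")
```

The grid is kept away from $\gamma = 0$, in line with the choice of the set $B$ in the proof above.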


5.2  A doubly stochastic process

We will only treat the simple example studied in Theorem 3.4.

Theorem 5.2: Let $\{X_t, t \ge 1\}$ be as in Theorem 3.4 with the exception that we replace the condition $E(X_t^4) \le K$ with the weaker condition $E(X_t^2) \le K$ for some $K$. Then there exists a sequence of estimators $\{\hat{a}_n\}$ such that $\hat{a}_n \to a$ a.s. and such that $\hat{a}_n$ is obtained by minimization of $L_n$ in (4.1) as described in the conclusion of Theorem 2.1.


Proof: Again we have that the functional form of $\phi_t$ is given by (5.1), but now with $X_{t|t-1}$ and $f_{t|t-1}$ given by (3.50) and (3.53). It follows that

$$\frac{\partial\phi_t}{\partial a} = -2\{X_t - (a + m_{t-1})X_{t-1}\}\Big(1 + \frac{\partial m_{t-1}}{\partial a}\Big)X_{t-1}/f_{t|t-1} \qquad (5.12)$$

and using the fact that $\partial^k m_t/\partial a^k = 0$ for $k > 1$ we have

$$\frac{\partial^2\phi_t}{\partial a^2} = 2\Big(1 + \frac{\partial m_{t-1}}{\partial a}\Big)^2 X_{t-1}^2/f_{t|t-1} \qquad (5.13)$$

while higher order derivatives of $\phi_t$ are zero. From (3.53) we have $1/f_{t|t-1} \le 1/\sigma^2$. On the other hand it was proved in the proof of Theorem 3.4 that $|\partial m_t/\partial a|$ is bounded above by a constant independent of $t$, and since $E(X_t - X_{t|t-1})^2 \le E(X_t^2)$, it now follows that $E(|\partial\phi_t/\partial a|) \le K_1$ for some constant $K_1$. Thus condition EN1 of Theorem 4.1 holds. Conditions EN2 and EN4 are trivially satisfied and it remains to verify EN3.

The matrix $C^n$ of EN3 in this case reduces to a scalar, namely

$$C^n = \frac{1}{n}\sum_{t=2}^{n} 2\Big(1 + \frac{\partial m_{t-1}}{\partial a}\Big)^2 \frac{X_{t-1}^2}{\sigma^2 + X_{t-1}^2(\delta^2 + \gamma_{t-1})}.$$

It was established in the proof of Theorem 3.4 that $\liminf_{t\to\infty}(1 + \partial m_t/\partial a)^2 > 0$ a.s. and that $\gamma_t \le 2b^2\delta^2$. Reasoning in the same way as in the last part of the proof of Theorem 5.1 it is concluded that $\liminf_{n\to\infty} C^n > 0$ a.s., and the theorem is proved. ||

Unfortunately we have not been able to prove asymptotic normality for either of the two examples treated here. The difficulty lies in verifying condition FN2 of Theorem 4.2.

6.  Summary remarks

In this paper as well as in T1 we have developed a general framework for analyzing estimates in nonlinear time series models. We have given applications to a number of different model classes and tried to deduce sufficient conditions for strong consistency and asymptotic normality from the general conditions. Our conditions reduce to the standard set of conditions (cf. Fuller 1976, Ch. 8) in the linear case, except that we do not necessarily require a homogeneous residual process $\{e_t\}$.

Explosive behavior, e.g. $E(X_t^2)$ increasing as an exponential function of $t$ as $t \to \infty$, is not permitted in the present setup. It should be noted, however, that Lai and Wei (1982, 1983) have recently proved consistency, but not asymptotic normality, of parameter estimates in linear explosive models. It is sometimes difficult to find conditions guaranteeing nonexplosive behavior for nonlinear models, and it is therefore a challenging task to try to extend Lai and Wei's results to nonlinear series.

Our work has potential applications in several other directions. One would be to extend our results to more general classes of examples, especially in the doubly stochastic case. Another important problem is that of hypothesis testing, in particular in connection with empirical identification of models.